INTRODUCTION

This research studies gastric cancer, with the aim of revealing new insights into the biology of
the disease that could have therapeutic implications. The scope of the study rests on three broad
objectives: (i) identification of dysregulated and susceptible pathways, as well as their novel
inter-relationships, in gastric adenocarcinoma (GAC); (ii) Pan-Cancer comparison of GAC with other
cancers to leverage therapeutic target information across cancers; (iii) identification of novel
therapeutic targets, both with and without currently known drugs that target them. We have
identified novel interactions amongst pathways in stomach and other cancers, along with certain
sub-groups of stomach cancer patients in whom those pathways may be exceptionally abnormal and
lead to worse survival. The interactions and corresponding sets of genes may be targetable by
existing drugs and/or drugs under development for treatment of that sub-group of stomach cancer
patients. Other sub-groups have other interactions and genes that may also be targeted using
different drugs. In that way, customized drug regimens could potentially be given to specific
patients whose cancers exhibit targetable characteristics. The objective of this research is to
find those targets and improve our understanding of the biology of stomach cancer.

KEYWORDS 

Gastric  cancer,  stomach  cancer,  gastrointestinal  cancers,  pathway  aberrations,  gastric  cancer 
therapeutic  targets,  Pan-Cancer,  dysregulated  pathways,  gastric  cancer  subtypes. 

ACCOMPLISHMENTS 

What  were  the  major  goals  of  the  project? 

The major goals of the project and their breakdown into milestones (as stated in the approved
SOW) are shown below, along with the percentage of completion for each milestone to date.
Please note that the milestones have not been completed in the originally proposed chronological
order: Specific Aims 2 and 3 were worked on before Specific Aim 1. That was due to external
factors beyond the PI’s control, as described under the section “CHANGES/PROBLEMS.”
Although the order of completion has changed, the tasks themselves and the amount of time
needed to complete them have not.


Specific Aim 1 | Timeline (months) | Percentage Completed

Major Task 1: Acquisition and Quality Control of gastric cancer data, in preparation for computational analysis
Acquire TCGA gastric cancer data, and in-house MD Anderson data (after procuring necessary approvals) | 0.5 |
Convert all acquired data into a “standardized” format suitable for computational analysis | 1 |
Assess and remove (if needed) batch effects from within TCGA and within MD Anderson data, and improve the overall quality of each data set individually | 2 |
Merge TCGA data with MD Anderson data, removing batch effects across both data sets | 2.5 |
Re-assess the quality of the overall data and iterate back to previous steps, if needed, until data are satisfactory | 3 |
Milestone(s) Achieved: “Cleaned up” gastric cancer data from TCGA and MD Anderson ready for computational analysis | 3 |

Major Task 2: Computational analysis of the gastric cancer data sets
Cluster the data sets and study the results | 4 |
Generate pathway activity scores for various pathways across multiple data types, and determine which ones are likely disrupted | 5-6 |
Correlate disrupted pathways across multiple data types (e.g. transcriptomic, proteomic, genomic, epigenomic) and across clinical variables (e.g. histology, stage, grade, outcome) via statistical analysis | 7-8 |
Milestone(s) Achieved: First round computational analysis for gastric cancer completed | 8 |

Major Task 3: Publish gastric cancer results
Discuss results with collaborators and perform any follow up analysis | 9-10 |
Write one or more manuscript(s) with input from designated mentor and collaborators | 11-12 |
Submit manuscript(s) and wait for reviews | 13-14 |
Respond to reviewers and resubmit. May repeat submission/resubmission process with multiple journals depending on where the paper(s) end up being published. Present results at conferences. | 15-19 |
Milestone(s) Achieved: Manuscript(s) published | 19 |

Total time for Specific Aim 1 | 19 |


Specific Aim 2 | Timeline (months) | Percentage Completed

Major Task 4: Acquisition and Quality Control of Pan-GI data, in preparation for computational analysis
Acquire TCGA Pan-GI data and convert them into a “standardized” format suitable for computational analysis | 20 | 100%
Assess and remove (if needed) batch effects from the data, and improve the quality of the data | 21 | 100%
Milestone(s) Achieved: “Cleaned up” TCGA Pan-GI data ready for computational analysis | 21 | 100%

Major Task 5: Computational analysis of the Pan-GI data sets
Cluster the data sets and study the results | 22 | 100%
Generate pathway activity scores for various pathways across multiple data types, and determine which ones are likely disrupted | 23-24 | 80%
Correlate disrupted pathways across multiple data types (e.g. transcriptomic, proteomic, genomic, epigenomic) and across clinical variables (e.g. histology, stage, grade, outcome) via statistical analysis | 25-26 | 80%
Compare gastric with other Pan-GI cancers and look for similarities and differences | 27 | 100%
Milestone(s) Achieved: First round computational analysis for Pan-GI cancers completed | 27 | 90%

Major Task 6: Publish Pan-GI cancer results
Discuss results with collaborators and perform any follow up analysis | 28-29 | 100%
Write one or more manuscripts with input from designated mentor and collaborators | 30-31 | 100%
Submit manuscript(s) and wait for reviews | 32-33 | 100%
Respond to reviewers and resubmit. May repeat submission/resubmission process with multiple journals depending on where the paper(s) end up being published. Present results at conferences. | 34-36 | 50%
Milestone(s) Achieved: Pan-GI manuscript(s) published | 36 | 87.5%

Total time for Specific Aim 2 | 36 |

Specific Aim 3 | Timeline (months) | Percentage Completed

Major Task 7: Identification and publication of potential therapeutic targets in gastric cancer
Identify potential genes and/or pathways in gastric cancer for targeted therapy, using gastric data only from Aim 1 | 9-10 |
Identify potential genes and/or pathways in gastric cancer for targeted therapy, using cross-tumor information from Pan-GI cancers from Aim 2 | 28-29 | 100%
Integrate results into manuscripts for Specific Aims 1 and 2. Present results at conferences. | 30-36 | 50%
Milestone(s) Achieved: Potential therapeutic targets identified and published | 36 | 50%

Total time for Specific Aim 3 (interspersed with other aims; not consecutive months) | 36 |

What  was  accomplished  under  these  goals? 

Specific Aim 2, Major Task 4: Acquisition and Quality Control of Pan-GI data, in preparation for
computational analysis

That task has been fully completed and the following milestone has been achieved:
“Cleaned up” TCGA Pan-GI data ready for computational analysis. In fact, not only have the
Pan-GI data been adjusted for batch effects and standardized, but data across all 33 TCGA tumor
types have been adjusted. Specifically, Dr. Akbani’s lab adjusted mRNA, miRNA and protein data,
and the adjusted datasets are being used by the TCGA PanCanAtlas project as the “official”
cleaned up datasets. The adjusted data are currently available at the password-protected
Synapse.org page for PanCanAtlas, but they will be released to the public at the Genomic Data
Commons portal (gdc.cancer.gov) once the PanCanAtlas papers have been published in Spring
2018.

The  following  figures  illustrate  examples  of  batch  effects  that  were  found  in  the  TCGA  mRNA  and 
miRNA  data,  as  well  as  the  results  after  correction  by  Dr.  Akbani’s  lab.  (Key:  gastric  (STAD),  colon 
(COAD),  rectal  (READ),  esophageal  (ESCA),  AML  (LAML),  endometrial  (UCEC)  cancers, 
Stratagene  reference  (Strat)). 
[Figure 1, panels A-C: clustered heat maps with annotation tracks for cancerType, platform, Center
and Replicates. Legend: cancerType: BCGSC_Strat (n=5, 0%), COAD (n=463, 23%), ESCA (n=182, 9%),
LAML (n=173, 9%), READ (n=156, 8%), STAD (n=450, 23%), UCEC (n=530, 27%), UNC_Strat (n=34, 2%);
platform: GA (n=855, 43%), HiSeq (n=1138, 57%); Center: BCGSC (n=810, 41%), UNC (n=1183, 59%);
Replicates: BCGSCrep (n=38, 2%), UNCrep (n=40, 2%), missing value (n=1915, 96%).]

Fig. 1. (A) Clustered heat map of mRNA data with genes in rows and samples in columns (red = high,
white = medium, blue = low expression). Large batch effects by platform can be seen in the green
rectangles in COAD, READ, UCEC and Strat data. Subtler batch effects by platform are observed in
gastric (STAD) data (black rectangles). (B) Zoomed-in view of the black rectangle in A, showing
that expression of certain genes corresponds with the platform variable, demonstrating a batch
effect. (C) Clustered heat map of the batch-effects-adjusted data. Batch effects by platform have
been mitigated and the platform types can be seen to merge together (green and black rectangles).


[Figure 2, panels A-B: clustered heat maps with annotation tracks for Disease, Protocol and
Platform. Legend: Disease: the 33 TCGA tumor types, ACC through UVM (STAD n=474, 4%); Protocol:
MultiMACs (n=9432, 87%) and one other protocol (n=1392, 13%); Platform: GA (n=1413, 13%),
HiSeq (n=9411, 87%).]

Fig.  2.  (A)  Clustered  heat  map  of  miRNA  data  with  miRNAs  in  rows  and  samples  in  columns  (red  =  high, 
white  =  medium,  blue  =  low  expression).  Batch  effects  by  platform  can  be  seen  (black  rectangle),  which  is 
strongly  illustrated  in  the  colorectal  data  (red  rectangle)  where  data  from  the  two  platforms  do  not  merge. 
(B)  Clustered  heat  map  of  the  batch  effects  adjusted  miRNA  data.  Batch  effects  by  platform  have  been 
mitigated  and  the  platform  types  can  be  seen  to  merge  together  (black  and  red  rectangles). 
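
The report does not state which algorithm was used for the batch adjustment, so the following is only a minimal sketch of the general idea, assuming the simplest possible correction: mean-centering one gene's values within each batch (here, the sequencing platform). The function name `center_by_batch` and the toy values are hypothetical, and this is not presented as the method used by Dr. Akbani's lab.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(values, batches):
    """Shift each batch so that its mean equals the overall mean.

    values  -- expression values of one gene, one per sample
    batches -- batch label (for example, the platform) of each sample
    """
    overall_mean = mean(values)

    # Collect the values that belong to each batch.
    grouped = defaultdict(list)
    for value, batch in zip(values, batches):
        grouped[batch].append(value)
    batch_mean = {batch: mean(vals) for batch, vals in grouped.items()}

    # Subtract the batch mean and restore the overall level.
    return [v - batch_mean[b] + overall_mean for v, b in zip(values, batches)]


# Toy values (hypothetical): the "GA" samples read two units higher.
expression = [5.0, 6.0, 7.0, 3.0, 4.0, 5.0]
platform = ["GA", "GA", "GA", "HiSeq", "HiSeq", "HiSeq"]
adjusted = center_by_batch(expression, platform)
```

After adjustment the two platforms share the same mean for this gene, which corresponds to the "merging" of platform types described in the figure captions. Mean-centering only removes additive shifts; it would not correct batch-specific differences in variance.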


Specific Aim 2, Major Tasks 5, 6 and 7: “Computational analysis of the Pan-GI data sets,” “Publish
Pan-GI cancer results,” and “Identification and publication of potential therapeutic targets in
gastric cancer.”

Most of those tasks have been completed. One manuscript has been published in Cancer Cell,
another is currently under review by Cell and Cancer Cell (the first round of reviews has been
received and looks encouraging), and a third is being prepared and is scheduled to be submitted
to Cancer Cell in early November (see Appendices for references). Dr. Akbani is the corresponding
author on the last one. Highlights of the major gastric cancer related findings in each of the
manuscripts, respectively, are:


1)  A  Pan-Cancer  Proteogenomic  Atlas  of  PI3K/AKT/mTOR  Pathway  Alterations  (Cancer  Cell) 


i.  PI3K/AKT/mTOR  pathway  is  disrupted  in  gastric  cancer  (Fig.  3). 

ii.  PIK3CA  is  the  most  mutated  gene  in  that  pathway,  with  approximately  20%  of  the  gastric 
samples  having  mutations  in  it  (Fig.  3). 

iii.  DEPTOR,  PIK3CA  and  RICTOR  are  the  most  frequently  amplified  genes  with 
approximately  5%  of  the  gastric  samples  having  amplifications  in  them  (Fig.  3). 

iv.  PTEN is the most frequently deleted gene in that pathway, with approximately 5% of the
gastric samples having deletions in it (Fig. 3).

v.  Gastric cancer (STAD) has intermediate levels of PI3K/AKT and mTOR pathway activity
compared to other cancers (Fig. 4B).

vi.  It  has  high  levels  of  phospho-mTOR,  phospho-S6,  and  phospho-4EBP1  proteins,  hinting 
at  potential  targets  for  therapy  (Fig.  4A). 


[Figure 3 graphic. Legend text: amplification, copy loss (low-level+mut. or high-level).]


Fig.  3.  (Adapted  from  Fig.  2B  in  the  paper.)  By  cancer  type,  percentages  of  somatic  mutation  or  copy 
alteration  for  each  indicated  gene.  Amplification  denotes  “high-level”  copy  gain.  Numbers  of  cases 
denote  representation  on  Whole  Exome  Sequencing  data  platform. 
Fig. 4. (Adapted from Fig. 1A-B in the paper.) Proteomic signatures of PI3K/AKT and mTOR across
human cancers. (A) Heatmap of RPPA features considered core to either PI3K/AKT or mTOR pathways
across 7,663 cancers. Red, higher expression (values normalized to SDs from the median across all
cancers); blue, lower expression. PI3K/AKT and mTOR features were each summarized into pathway
activity scores for each tumor profile (yellow, higher inferred activity; blue, lower activity;
bright yellow/blue denotes a change of at least 1 SD from the median). Cancer types (denoted by
TCGA project name) are ordered from low to high average mTOR pathway score. (B) Boxplots of
PI3K/AKT (top) and mTOR (bottom) pathway activity scores, as inferred using RPPA data. Boxplots
represent 5%, 25%, 50%, 75%, and 95%.
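
The caption above describes features normalized to SDs from the median and then summarized into a per-tumor pathway activity score. As a hedged illustration, the sketch below assumes the summary is an unweighted mean of the normalized features and that the SD is the population SD; the paper's actual weighting is not given in this report. The function names and the feature labels `feature_1` and `feature_2` are hypothetical.

```python
from statistics import median, pstdev


def normalize_feature(values):
    """Express each value as the number of SDs from the median across samples."""
    center = median(values)
    spread = pstdev(values)  # population SD; an assumption of this sketch
    if spread == 0:
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]


def pathway_scores(features):
    """Average the normalized features of one pathway, sample by sample.

    features -- dict mapping feature name to a list of values (one per sample)
    """
    normalized = [normalize_feature(vals) for vals in features.values()]
    n_samples = len(normalized[0])
    return [
        sum(feature[i] for feature in normalized) / len(normalized)
        for i in range(n_samples)
    ]


# Toy RPPA-like values (hypothetical) for three tumors and two features.
rppa = {"feature_1": [1.0, 2.0, 3.0], "feature_2": [10.0, 20.0, 30.0]}
scores = pathway_scores(rppa)
```

Because each feature is rescaled before averaging, a feature measured on a larger numeric scale does not dominate the score.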

2)  Comparative  Molecular  Analysis  of  Gastrointestinal  Adenocarcinomas  (under  review  by 
Cell  and  Cancer  Cell) 

i)  FBXW7, SMAD2, SOX9, MUC6 and ZFP36L2 are some of the genes that are
significantly mutated in GI cancers, but not in non-GI cancers (Fig. 5A).

ii)  KRAS, GATA6, CDK6 and GATA4 are some of the genes that are significantly
amplified in GI cancers, but not in non-GI cancers (Fig. 5B).

iii)  GMDS, PARK2, and SMAD4 are some of the genes that are significantly deleted in GI
cancers, but not in non-GI cancers (Fig. 5B).

iv)  Five novel Pan-GI subtypes are proposed: EBV+, hypermutated SNV, hypermutated indel,
chromosomal instability (CIN), and genomically stable (GS) (Fig. 6A).

v)  The EBV+ subtype is found only in gastric cancer. Gastric cancer also has a smaller
proportion of the CIN subtype than other GI cancers (Fig. 6B-C).

vi)  Gastric cancer has hypomethylation and fewer CpG island methylator phenotype
(CIMP) samples than other GI cancers (Fig. 6C).

vii)  The upper GI tract has more AA > AC mutations than the lower GI tract (Fig. 6C).

viii)  RTK/RAS/PI3K, TP53/Cell cycle, TGF-beta, and WNT pathways are all genomically
altered in GI cancers, providing potential avenues for targeted therapy (Fig. 7).
[Figure 5 panels: (A) Significantly mutated genes; (B) Amplifications and Deletions. Axis label:
GIAC [-log10(q value)].]



Fig.  5.  Genomic  features  of  gastrointestinal  adenocarcinomas.  (A)  Quantile-quantile  (Q-Q)  plot  of 
significantly  mutated  genes  in  gastrointestinal  adenocarcinomas  (GIAC)  (horizontal  axis)  compared  to  other 
adenocarcinomas  (non-GI  AC)  (vertical  axis).  Significantly  mutated  genes  unique  to  GIAC  are  marked 
green  flanking  the  horizontal  axis,  those  unique  to  other  adenocarcinomas  are  marked  red  bordering  the 
vertical  axis,  and  common  to  both  are  marked  yellow  scattered  along  the  diagonal.  (B)  Q-Q  plot  of  significant 
focal  amplifications  (left)  and  deletions  (right)  of  GIAC  compared  to  non-GI  AC. 
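
The comparison in Fig. 5 sorts genes into those significant only in GIAC, only in non-GI AC, or in both. The sketch below shows one way such a grouping could be computed from two tables of q values. The cutoff of 0.1, the function names, and the gene labels are all assumptions made for illustration; the report does not state the threshold that was used.

```python
import math


def neg_log10(q_value):
    """Axis transform used for the plot: larger means more significant."""
    return -math.log10(q_value)


def classify_genes(q_giac, q_other, q_cutoff=0.1):
    """Label each gene by the cohort(s) in which it is significant.

    q_giac, q_other -- dicts mapping gene name to q value in each cohort
    q_cutoff        -- hypothetical significance threshold (assumption)
    A gene missing from a cohort is treated as not significant there.
    """
    labels = {}
    for gene in sorted(set(q_giac) | set(q_other)):
        in_giac = q_giac.get(gene, 1.0) < q_cutoff
        in_other = q_other.get(gene, 1.0) < q_cutoff
        if in_giac and in_other:
            labels[gene] = "common"
        elif in_giac:
            labels[gene] = "GIAC only"
        elif in_other:
            labels[gene] = "non-GI AC only"
        else:
            labels[gene] = "neither"
    return labels


# Toy q values (hypothetical genes, not results from the study).
q_giac = {"GENE_A": 0.001, "GENE_B": 0.5, "GENE_C": 0.01}
q_other = {"GENE_A": 0.02, "GENE_B": 0.03, "GENE_C": 0.8}
labels = classify_genes(q_giac, q_other)
```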
Fig. 6. Molecular subtypes of gastrointestinal adenocarcinomas. (A) Flowchart depicting the
classification of GIAC molecular subtypes: Epstein-Barr virus (EBV)-positive (red);
hypermutated-single-nucleotide variant predominant (HM-SNV) (gold); hypermutated-insertion/deletion
predominant (HM-IND) (blue); chromosomal instability (CIN) (purple); and genomically stable (GS)
(green). (B) Molecular subtypes among GI cancers across the gastrointestinal tract represented by
percentage per anatomic region. (C) Schematic summarizing key molecular features of GI cancers.

Fig. 7. Integrated molecular comparison of somatic alterations across GI molecular subtypes. A
comparison of somatic disruptions across molecular subtypes as indicated by somatic mutations and
copy-number alterations in select genes associated with functional pathway groups. The lines and
arrows within the functional pathways show pairwise molecular interactions. Deep deletions marked
in blue indicate loss of more than half of gene copies. Amplifications are marked in red. Only
missense mutations reported in the COSMIC repository are included and indicated by boxes partially
filled with green. Boxes partially filled with black indicate nonsense or frameshift mutations.
Alteration frequencies for each gene are listed inside rounded rectangles divided by molecular
subtype, with red shading denoting gene activation, and blue denoting inactivation. Percentage of
somatic alteration is indicated by numbers to the left of each gene box and divided by upper (U)
and lower GI (L). (A) Mutations and SCNAs for selected genes associated with RTK mitogen signaling
pathways. (B) Mutations, SCNAs, and epigenetic silencing in selected genes associated with
TP53/cell cycle pathways. (C) Mutations and SCNAs for selected genes associated with
developmental pathways.

3)  A Pan-Cancer atlas of genomic, epigenomic and transcriptomic alterations in the TGF-beta
pathway (scheduled to be submitted to Cancer Cell in early November, 2017)

i)  TGF-beta pathway is disrupted in gastric cancer (Fig. 8).

ii)  The most frequently amplified TGF-beta pathway genes are ACVR2A (13%), BMPR2
(10%), SMAD4 (9%), and SPTBN1 (8%) (Fig. 8).

iii)  The most frequently deleted TGF-beta pathway gene is SMAD4 (6%) (Fig. 8).

iv)  Mutation frequencies are not very high, with BMP7 having the highest frequency of
3% (Fig. 8).

v)  Hotspot mutations have been identified in ACVR2A, BMPR2, and SMAD4 in gastric
cancer (Fig. 9).

vi)  Statistically significant positive correlations have been found in gastric cancer between
TGF-beta pathway activity and the hormone receptor, breast reactive, EMT, immune,
hormone signaling, and PI3K/AKT pathways, whereas negative correlations have
been found with cell cycle, DNA damage response and apoptosis pathways (Fig. 10).

vii)  TGF-beta pathway is highly regulated by epigenetics in gastric cancer compared to
other cancers (Fig. 11). The box plot shows a large dynamic range for STAD (and also
DLBC) compared to others.

[Figure 8 panel title: Stomach adenocarcinoma]


Fig.  8.  Alterations  in  various  TGF-beta  pathway  genes  in  gastric  cancer.  Percentages  represent  fraction  of 
samples  in  the  cohort  with  the  given  aberrations  present;  left  box  -  mutations,  middle  box  -  deletions,  right 
box  -  amplifications. 
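
The percentages in Fig. 8 are fractions of samples carrying each type of aberration. A minimal sketch of that calculation follows, assuming per-sample alteration calls are already available as sets of labels; the input layout, the function name `alteration_percentages`, and the gene label `GENE_X` are hypothetical.

```python
ALTERATION_KINDS = ("mutation", "deletion", "amplification")


def alteration_percentages(calls_by_gene):
    """Percentage of samples with each alteration type, per gene.

    calls_by_gene -- dict mapping gene name to a non-empty list with one
                     entry per sample; each entry is the set of alteration
                     kinds called in that sample (empty set if none).
    """
    result = {}
    for gene, sample_calls in calls_by_gene.items():
        n_samples = len(sample_calls)
        result[gene] = {
            kind: 100.0 * sum(kind in calls for calls in sample_calls) / n_samples
            for kind in ALTERATION_KINDS
        }
    return result


# Toy cohort of four samples (hypothetical calls, not study data).
cohort = {
    "GENE_X": [{"mutation"}, set(), {"amplification", "mutation"}, set()],
}
percentages = alteration_percentages(cohort)
```

A sample can carry more than one alteration type in the same gene, so the three percentages for a gene need not sum to the fraction of altered samples.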
Fig. 9. (A) Lollipop plots showing mutation frequencies (y-axis) along the gene loci. Hotspot
mutations have long lollipop stems. (B) The hotspots occur at higher frequencies in
gastroesophageal and colorectal cancers than other cancers.

Fig. 10. Pearson’s correlation coefficients between TGF-beta pathway activity and the activity of
12 other pathways (rows) across 33 disease types (columns), including gastric cancer (STAD).
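
For readers unfamiliar with the statistic in Fig. 10, the sketch below computes Pearson's correlation coefficient between one reference pathway's activity scores and those of several other pathways within a single disease type. The function names and the toy scores are hypothetical; the pathway labels `pathway_up` and `pathway_down` are placeholders, not results.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (spread_x * spread_y)


def correlate_with_pathways(reference, others):
    """Correlate one pathway's scores with each of the other pathways."""
    return {name: pearson(reference, scores) for name, scores in others.items()}


# Toy activity scores for four tumors of one disease type (hypothetical).
tgf_beta = [0.1, 0.4, 0.5, 0.9]
other_pathways = {
    "pathway_up": [0.2, 0.8, 1.0, 1.8],    # rises with the reference
    "pathway_down": [0.9, 0.6, 0.5, 0.1],  # falls as the reference rises
}
correlations = correlate_with_pathways(tgf_beta, other_pathways)
```

Repeating the call once per disease type would fill one column of a pathways-by-disease matrix like the one in the figure.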

[Figure 11 panel title: Methylation of TGF Pathway genes]


Fig.  11.  Box  plot  showing  median  DNA  methylation  levels  across  the  TGF-beta  pathway  genes  (y-axis)  vs. 
tumor  types.  STAD  (gastric)  and  DLBC  (B-cell  lymphoma)  have  much  larger  dynamic  ranges  compared  to 
other  tumor  types,  potentially  indicating  a  strong  regulatory  role  of  epigenetics  in  those  diseases. 

Tasks  that  will  be  performed  in  future  under  those  Specific  Aims 

1)  The remaining two of the three manuscripts described above will be published.

2)  While we have studied several pathways already (TGF-beta, PI3K, RTK/RAS, mTOR,
WNT, cell cycle), several other pathways will be studied in future (e.g. DNA damage
response, TP53, immune, etc.).

3)  So far, we have analyzed gastric cancer in the context of other cancers, as mentioned in
Specific Aims 2 and 3. Going forward, we will focus explicitly on gastric cancer, in
accordance with Specific Aim 1.

What  opportunities  for  training  and  professional  development  has  the  project  provided? 

Rehan  Akbani,  PhD  (PI):  The  project  has  allowed  Dr.  Akbani  several  opportunities  for 
professional  development.  He  works  with  his  mentor,  Dr.  Jaffer  Ajani  on  a  regular  basis  and 
updates him on progress. Dr. Ajani, in turn, guides Dr. Akbani’s research. Dr. Ajani has also set
up a larger group of about a dozen researchers who are working on gastric and other GI cancers.
Dr. Akbani is part of the group and has been afforded the opportunity to collaborate with those
experts. The larger group meets monthly, with researchers taking turns presenting their work and
receiving feedback. Dr. Akbani has also presented to that group.

Dr.  Akbani  is  also  one  of  the  central  members  of  The  Cancer  Genome  Atlas  (TCGA)  project. 
TCGA  funding  ended  in  July,  2016,  but  the  consortium  continues  to  work  on  PanCanAtlas  and 
other  projects.  The  funding  from  this  grant  has  allowed  Dr.  Akbani  to  continue  collaborating  with 
TCGA  on  projects  related  to  this  grant,  such  as  Pan-GI  cancers,  PI3K/AKT  pathway  disruptions, 
TGF-beta  pathway  disruptions,  etc.  In  some  of  them  (e.g.  TGF-beta)  Dr.  Akbani  plays  a  leading 
role.  That  would  not  have  been  possible  without  the  kind  of  funding  provided  by  this  grant. 

Dr. Akbani presented his research at a TCGA symposium on Nov. 17, 2016, that was attended
by several hundred people. As more of his research gets published, he plans to present his results
at more conferences and symposia, including a TCGA symposium planned in Washington DC in
September 2018.

Shiyun  Ling,  PhD  (postdoctoral  fellow):  Dr.  Ling  was  Dr.  Akbani’s  postdoctoral  fellow.  Under 
Dr.  Akbani’s  guidance  and  supervision,  Dr.  Ling  gained  hands-on  experience  in  performing  quality 
control  and  batch  effects  correction  of  several  “omics”  data  sets  from  TCGA,  as  stated  in  Major 
Task  4  in  the  approved  SOW.  Those  data  sets  are  now  being  used  by  TCGA  for  virtually  all  of 
their  PanCanAtlas  projects  that  are  currently  underway  (approximately  2  dozen).  After  doing  a 
superb  job  on  the  data,  Dr.  Ling  accepted  a  permanent  position  as  a  Senior  Statistical  Analyst  at 
Mount  Sinai  hospital  in  Connecticut  in  Spring  2017. 

Apurva  Hegde,  MS  (research  assistant):  Ms.  Hegde  was  Dr.  Akbani’s  research  assistant.  She 
performed  the  analysis  stated  in  Major  Task  5  under  Dr.  Akbani’s  supervision  and  guidance.  The 
figures  she  generated  have  been  included  in  the  Pan-GI  manuscript  that  has  been  submitted  for 
review  to  Cell  and  Cancer  Cell.  She  eventually  accepted  a  permanent  position  as  an  Associate 
Bioinformatician  at  the  Translational  Genomics  Research  Institute  (TGen)  in  Phoenix,  AZ  in 
Spring  2017. 

How  were  the  results  disseminated  to  communities  of  interest? 

Some  of  the  research  has  already  been  published  in  renowned  journals  (see  Products  section). 
Other  manuscripts  are  either  currently  under  review,  or  in  preparation.  Besides  publications,  Dr. 
Akbani  has  presented  the  results  of  his  research  in  a  TCGA  symposium  in  Nov.  2016  that  was 
widely  attended.  As  the  research  matures  further,  Dr.  Akbani  plans  to  attend  more  conferences 
and  symposia  to  present  his  results. 

What  do  you  plan  to  do  during  the  next  reporting  period  to  accomplish  the  goals? 

1)  Perform  additional  research  by  following  the  steps  highlighted  in  the  revised  SOW  (see 
Changes  section). 

2)  Publish  the  results  of  the  research  in  renowned  journals. 

3)  Present  the  results  at  widely  attended  conferences  and  symposia. 

4)  Participate  in  cancer  conferences  like  AACR  and  ASCO  to  improve  knowledge  of  gastric 
cancer. 

5)  Collaborate  with  Dr.  Ajani  and  his  team  of  experts,  and  solicit  regular  feedback  about  the 
research. 

6)  Study  the  latest  literature  in  gastric  cancer  to  keep  abreast  of  new  developments. 

IMPACT 

What  was  the  impact  on  the  development  of  the  principal  discipline(s)  of  the  project? 

Since this is the first year of the grant and only one paper has been published so far, the impact
of the research is difficult to gauge at this point. Two more publications are underway. The
papers are expected to be highly cited eventually because (i) they are expected to be published
in the high-quality Cell family of journals, and (ii) they are part of the high-profile TCGA
PanCanAtlas project. Ultimately, the hope is that those and other publications that result from
this grant will provide new targets for therapy in gastric cancer and increase our knowledge of
the disease.

What was the impact on other disciplines?

Nothing  to  report. 

What  was  the  impact  on  technology  transfer? 

Nothing  to  report. 

What  was  the  impact  on  society  beyond  science  and  technology? 

Nothing  to  report. 

CHANGES/PROBLEMS 

The PI is part of the TCGA consortium. TCGA decided last year to move forward with
Pan-Gastrointestinal (Pan-GI) analysis, and Dr. Akbani is a major contributor to the project.
In fact, the Pan-GI paper is currently under review by Cell and Cancer Cell. In the original grant
application and SOW, we had listed work on the Pan-GI project among the goals for years 2 and 3.
However, because of the accelerated timeline from TCGA’s leadership, we ended up working on it
in year 1. Year 1 goals, in turn, will be performed in year 2. Therefore, although the goals and
level of effort are still the same as those listed in the original SOW, they have been reordered
chronologically. The new SOW is provided in the appendix.

PRODUCTS 

Publications,  conference  papers,  and  presentations 
Journal  publications  (see  appendix  for  details): 

1)  Zhang  Y,  Kwok-Shing  Ng  P,  Kucherlapati  M,  Chen  F,  Liu  Y,  Tsang  YH,  de  Velasco  G, 
Jeong  KJ,  Akbani  R,  Hadjipanayis  A,  Pantazi  A,  Bristow  CA,  Lee  E,  Mahadeshwar  HS, 
Tang  J,  Zhang  J,  Yang  L,  Seth  S,  Lee  S,  Ren  X,  Song  X,  Sun  H,  Seidman  J,  Luquette  LJ, 
Xi  R,  Chin  L,  Protopopov  A,  Westbrook  TF,  Shelley  CS,  Choueiri  TK,  Ittmann  M,  Van 
Waes  C,  Weinstein  JN,  Liang  H,  Henske  EP,  Godwin  AK,  Park  PJ,  Kucherlapati  R,  Scott 
KL,  Mills  GB,  Kwiatkowski  DJ,  Creighton  CJ.  A  Pan-Cancer  Proteogenomic  Atlas  of 
PI3K/AKT/mTOR  Pathway  Alterations.  Cancer  Cell.  2017  Jun  12;31(6):820-832.e3. 
PMID:  28528867 

2)  Yang Liu, Nilay S. Sethi, Toshinori Hinoue, Barbara G Schneider, Andrew D. Cherniack,
Francisco  Sanchez-Vega,  Jose  A.  Seoane,  Reanne  Bowlby,  Mirazul  Islam,  Jaegil  Kim, 
Walid  Chatila,  Farshad  Farshidfar,  Rehan  Akbani,  Rupa  S.  Kanchi,  Charles  S.  Rabkin, 
Joseph  E.  Willis,  Kenneth  K.  Wang,  Shannon  J.  McCall,  Lopa  Mishra,  Alexander  J.  Lazar, 
The  Cancer  Genome  Atlas  Research  Network,  Vesteinn  Thorsson,  Adam  J.  Bass,  Peter 
W.  Laird.  Comparative  Molecular  Analysis  of  Gastrointestinal  Adenocarcinomas.  Under 
review  by  Cell  and  Cancer  Cell. 

3)  Anil  Korkut,  Sobia  Zaidi,  Rupa  Kanchi,  Ashton  C.  Berger,  Gordon  Robertson,  Lawrence  N 
Kwong,  Mike  Datto,  Jason  Roszik,  Shiyun  Ling,  Visweswaran  Ravikumar,  Ganiraju 
Manyam,  Arvind  Rao,  Simon  Shelley,  Yuexin  Liu,  Zhenlin  Ju,  Donna  Hansel,  Guillermo 
de  Velasco,  Arjun  Pennathur,  Jesper  B.  Andersen,  Colm  J.  O'Rourke,  Simon  Shelley, 
Kazu  Ohshiro,  Wilma  Jogunoori,  Nancy  R.  Gough,  Shulin  Li,  Hatice  Osmanbeyoglu, 
Andres Houseman, Shuyun Rao, Maciej Wiznerowicz, Jian Chen, Shoujun Gu, Wencai
Ma,  Jiexin  Zhang,  Pan  Tong,  Andrew  D.  Cherniack,  Chuxia  Deng,  Linda  Resar-Smith,  The 
Cancer  Genome  Atlas  Research  Network,  Lopa  Mishra,  Rehan  Akbani.  A  Pan-Cancer 
atlas  of  genomic,  epigenomic  and  transcriptomic  alterations  in  the  TGF-beta  pathway. 
Manuscript  scheduled  to  be  submitted  to  Cancer  Cell  in  early  November,  2017. 


Conference  presentations: 

1)  A  Pan-Cancer  analysis  of  TGF-beta  pathway  aberrations,  presented  by  Rehan  Akbani. 
TCGA  PanCancerAtlas  symposium,  Houston,  TX,  Nov  17,  2016. 

What individuals have worked on the project?

Name: Rehan Akbani
Project Role: PI
Research Identifier (e.g. ORCID ID):
Nearest person month worked: 3.6
Contribution to Project: Led the project as PI. Performed analysis. Supervised the work of others.
Funding Support:

Name: Shiyun Ling
Project Role: Postdoctoral Fellow
Research Identifier (e.g. ORCID ID):
Nearest person month worked: 6
Contribution to Project: Performed analysis under the direction of the PI
Funding Support:

Name: Apurva Hegde
Project Role: Research Assistant II
Research Identifier (e.g. ORCID ID):
Nearest person month worked: 6
Contribution to Project: Performed analysis under the direction of the PI
Funding Support:

Has  there  been  a  change  in  the  active  other  support  of  the  PD/PI(s)  or  senior/key 
personnel  since  the  last  reporting  period? 

Yes,  funding  support  for  the  PI  has  changed.  The  updated  support  is  as  follows  (excluding  the 
current  grant). 


Ongoing  Research  Support 

1U24CA210950-01 (Liang and Mills) 7/1/2016-6/30/2021 1.44 calendar

NIH/NCI

TCPA: an Integrated Bioinformatics Resource for Functional Cancer Proteomic Data

Major goals: To expand the scope of TCPA by adding new functionalities and datasets, and to
enhance and improve its existing analytic capabilities.

1U24CA210949-01 (Weinstein, Akbani, Mills) 9/1/2016-8/31/2021 2.4 calendar

NIH/NCI 

Batch  effects  in  molecular  profiling  data  on  cancers:  detection,  quantitation,  interpretation,  and 
correction 

Major  goals:  The  primary  goal  is  to  analyze  cancer  data  for  batch  effects  for  various  projects 
specified  by  NCI’s  Center  for  Cancer  Genomics  (CCG).  The  data  will  be  checked  for  batch 
effects,  which  will  be  quantified  and  the  data  corrected  if  needed.  A  secondary  goal  is  to  further 
enhance  the  batch  effects  analysis  with  new  algorithms  and  better  quality  control  algorithms. 

5 3P30 CA016672 (Dmitrovsky) 7/1/2015-6/30/2018 0.36 calendar

NIH/NCI 

Bioinformatics  Shared  Resources  (PP-SR22) 

Major  goals:  To  assist  researchers  in  the  application  of  state-of-the-art  methodology  for  the 
development,  conduct,  and  analysis  of  studies  using  high-throughput  technologies. 

1U24CA199461-01 (Weinstein and Broom) 9/1/2015-8/31/2020 0.48 calendar

NIH/NCI

"Next Generation" Clustered Heat Maps for Fluent, Interactive Exploration of Omic Data

Major  goals:  Expand  and  enhance  the  capabilities  of  the  NG-CHM  system.  Extend  and  enhance 
the  graphical  NG-CHM  builder.  Improve  the  interoperability  of  the  NG-CHM  system  and 
integrate  further  with  other  tools,  frameworks,  and  systems.  Create  additional/expanded 
compendia  of  cancer-related  public  datasets.  Actively  promote  the  NG-CHM  system  and 
interact  with  its  user  community. 

1U24CA210950-01 (Akbani, Weinstein, Mills) 09/01/2016-08/31/2021 3.60 calendar

NIH/NCI 

Integrated  analysis  of  protein  expression  data  from  the  Reverse  Phase  Protein  Array  (RPPA) 
platform 

The  primary  goal  is  to  analyze  cancer  proteomics  data  from  the  RPPA  platform  for  various 
projects  specified  by  the  NCI. 

Completed  Research  Support 

NIH/NCI 5 U24 CA143883 04 Weinstein, Akbani, Mills (PI) 9/29/2009-7/31/2016
An Integrative Pipeline for Analysis & Translational Application of TCGA Data (GDAC)

The overall goal of the TCGA GDAC is to generate computational pipelines for automated
integration and analysis of the data generated by the TCGA Genome Characterization Centers.

What  other  organizations  were  involved  as  partners? 

Organization  Name:  The  Cancer  Genome  Atlas  (TCGA) 

Location  of  Organization:  NCI/NIH,  Washington  DC 

Partner's  contribution  to  the  project:  Collaboration  (please  note  that  funding  from  TCGA 
completed  on  7/31/2017,  before  the  reporting  period  started  for  this  grant,  so  no  funding  was 
provided  for  this  work  by  TCGA  during  the  reporting  period). 

APPENDICES


Updated SOW (Chronologically reordered so Specific Aims 2 and 3 occur before Specific Aim
1. Tasks and level of effort per task remain the same. Please see “Changes” section for details.)


Specific Aim 2

Task | Timeline (months) | Site

Major Task 4: Acquisition and Quality Control of Pan-GI data, in preparation for computational analysis
Acquire TCGA Pan-GI data and convert them into a “standardized” format suitable for computational analysis | 1 | Dr. Akbani
Assess and remove (if needed) batch effects from the data, and improve the quality of the data | 2 | Dr. Akbani
Milestone(s) Achieved: “Cleaned up” TCGA Pan-GI data ready for computational analysis | 2

Major Task 5: Computational analysis of the Pan-GI data sets
Cluster the data sets and study the results | 3 | Dr. Akbani, Dr. Ajani
Generate pathway activity scores for various pathways across multiple data types, and determine which ones are likely disrupted | 4-5 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Correlate disrupted pathways across multiple data types (e.g. transcriptomic, proteomic, genomic, epigenomic) and across clinical variables (e.g. histology, stage, grade, outcome) via statistical analysis | 6-7 | Dr. Akbani, Dr. Ajani
Compare gastric with other Pan-GI cancers and look for similarities and differences | 8 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Milestone(s) Achieved: First round computational analysis for Pan-GI cancers completed | 8

Major Task 6: Publish Pan-GI cancer results
Discuss results with collaborators and perform any follow up analysis | 9-10 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Write one or more manuscripts with input from designated mentor and collaborators | 11-12 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Submit manuscript(s) and wait for reviews | 13-14 | Dr. Akbani
Respond to reviewers and resubmit. May repeat submission/resubmission process with multiple journals depending on where the paper(s) end up being published. Present results at conferences. | 15-16 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Milestone(s) Achieved: Pan-GI manuscript(s) published | 17

Total time for Specific Aim 2 | 17

Specific Aim 3 (interspersed timeline)

Major Task 7: Identification and publication of potential therapeutic targets in gastric cancer
Identify potential genes and/or pathways in gastric cancer for targeted therapy, using gastric data only from Aim 1 | 26-27 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Identify potential genes and/or pathways in gastric cancer for targeted therapy, using cross-tumor information from Pan-GI cancers from Aim 2 | 9-10 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Integrate results into manuscripts for Specific Aims 1 and 2. Present results at conferences. | 11-17 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Milestone(s) Achieved: Potential therapeutic targets identified and published | 27

Total time for Specific Aim 3 (interspersed with other aims; not consecutive months) | 27

Specific Aim 1

Task | Months | Site

Major Task 1: Acquisition and Quality Control of gastric cancer data, in preparation for computational analysis
Acquire TCGA gastric cancer data, and in-house MD Anderson data (after procuring necessary approvals) | 17.5 | Dr. Akbani, Dr. Ajani
Convert all acquired data into a “standardized” format suitable for computational analysis | 18 | Dr. Akbani
Assess and remove (if needed) batch effects from within TCGA and within MD Anderson data, and improve the overall quality of each data set individually. | 19 | Dr. Akbani
Merge TCGA data with MD Anderson data, removing batch effects across both data sets. | 19.5 | Dr. Akbani
Re-assess the quality of the overall data and iterate back to previous steps, if needed, until data are satisfactory | 20 | Dr. Akbani, Dr. Ajani
Milestone(s) Achieved: “Cleaned up” gastric cancer data from TCGA and MD Anderson ready for computational analysis | 20

Major Task 2: Computational analysis of the gastric cancer data sets
Cluster the data sets and study the results | 21 | Dr. Akbani, Dr. Ajani
Generate pathway activity scores for various pathways across multiple data types, and determine which ones are likely disrupted | 22-23 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Correlate disrupted pathways across multiple data types (e.g. transcriptomic, proteomic, genomic, epigenomic) and across clinical variables (e.g. histology, stage, grade, outcome) via statistical analysis | 24-25 | Dr. Akbani, Dr. Ajani
Milestone(s) Achieved: Computational analysis for gastric cancer completed | 25

Major Task 3: Publish gastric cancer results
Discuss results with collaborators and perform any follow up analysis | 26-27 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Write one or more manuscript(s) with input from designated mentor and collaborators | 28-29 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Submit manuscript(s) and wait for reviews. | 30-31 | Dr. Akbani
Respond to reviewers and resubmit. May repeat submission/resubmission process with multiple journals depending on where the paper(s) end up being published. Present results at conferences. | 32-36 | Dr. Akbani, Dr. Ajani, Dr. Weinstein, Dr. Hofstetter
Milestone(s) Achieved: Manuscript(s) published | 36

Total time for Specific Aim 1 | 36

Publications


Manuscript  under  review  by  Cell  and  Cancer  Cell.  Abstract  provided  for  reference  only.  Full 
manuscript  available  upon  request. 

Comparative  Molecular  Analysis  of  Gastrointestinal  Adenocarcinomas 

Yang Liu, Nilay S. Sethi, Toshinori Hinoue, Barbara G. Schneider, Andrew D. Cherniack, Francisco
Sanchez-Vega,  Jose  A.  Seoane,  Reanne  Bowlby,  Mirazul  Islam,  Jaegil  Kim,  Walid  Chatila, 
Farshad  Farshidfar,  Rehan  Akbani,  Rupa  S.  Kanchi,  Charles  S.  Rabkin,  Joseph  E.  Willis, 
Kenneth  K.  Wang,  Shannon  J.  McCall,  Lopa  Mishra,  Alexander  J.  Lazar,  The  Cancer  Genome 
Atlas  Research  Network,  Vesteinn  Thorsson,  Adam  J.  Bass,  Peter  W.  Laird. 

SUMMARY 

We  analyzed  921  adenocarcinomas  of  the  esophagus,  stomach,  colon  and  rectum  to  explore  the 
shared  and  distinguishing  molecular  characteristics  of  gastrointestinal  tract  adenocarcinomas 
(GIAC),  probing  beyond  tissue-specific  and  anatomic  boundaries.  We  found  that  hypermutated 
(HM)  tumors  were  molecularly  distinct  regardless  of  cancer  type,  and  could  be  subdivided  into 
those  enriched  for  insertions/deletions  (HM-IND),  representing  MSI-H  cases  with  epigenetic 
silencing  of  MLH1  in  the  context  of  CpG  Island  Methylator  Phenotype  (CIMP),  versus  tumors  with 
elevated  single  nucleotide  variants  (HM-SNV)  associated  with  mutations  in  POLE.  Tumors  with 
chromosomal  instability  (CIN)  displayed  more  diversity,  with  gastroesophageal  adenocarcinomas 
harboring  more  fragmented  genomes  associated  with  genomic  doubling  and  distinct  mutational 
signatures.  We  identified  a  group  of  tumors  in  the  colon  and  rectum  lacking  hypermutation  and 
aneuploidy  termed  Genome  Stable  (GS)  and  enriched  in  DNA  hypermethylation  and  mutations  in 
KRAS,  SOX9  and  PCBP1.  This  comprehensive  analysis  reveals  molecular  underpinnings  of 
GIAC  that  transcend  anatomic  boundaries. 

Manuscript to be submitted to Cancer Cell in early November, 2017. Abstract provided for
reference  only.  Full  manuscript  available  upon  request. 

A  Pan-Cancer  atlas  of  genomic,  epigenomic  and  transcriptomic 
alterations  in  the  TGF-beta  pathway. 

Anil Korkut, Sobia Zaidi, Rupa Kanchi, Ashton C. Berger, Gordon Robertson, Lawrence N Kwong,
Mike  Datto,  Jason  Roszik,  Shiyun  Ling,  Visweswaran  Ravikumar,  Ganiraju  Manyam,  Arvind  Rao, 
Simon  Shelley,  Yuexin  Liu,  Zhenlin  Ju,  Donna  Hansel,  Guillermo  de  Velasco,  Arjun  Pennathur, 
Jesper  B.  Andersen,  Colm  J.  O'Rourke,  Simon  Shelley,  Kazu  Ohshiro,  Wilma  Jogunoori,  Nancy 
R.  Gough,  Shulin  Li,  Hatice  Osmanbeyoglu,  Andres  Houseman,  Shuyun  Rao,  Maciej 
Wiznerowicz,  Jian  Chen,  Shoujun  Gu,  Wencai  Ma,  Jiexin  Zhang,  Pan  Tong,  Andrew  D. 
Cherniack,  Chuxia  Deng,  Linda  Resar-Smith,  The  Cancer  Genome  Atlas  Research  Network, 
Lopa  Mishra,  Rehan  Akbani. 

Summary 

Here, we provide a multi-omic analysis of the transforming growth factor β (TGF-β)-Smad pathway
in diverse human cancers. Of 9,125 tumors representing 33 cancer types in The Cancer Genome
Atlas, we discovered that 41% have at least one genomic alteration in a core of 44 TGF-β
pathway genes. The highest frequency occurred in gastrointestinal cancers. We identified
hotspots in six genes, including those encoding ligands (BMP5), receptor subunits (TGFBR2,
ACVR2A, BMPR2), and Smads (SMAD2, SMAD4). Transcriptomic analyses showed that
increased or decreased TGF-β pathway activity stratified patient survival within some cancers,
and tumor context was a key determinant of TGF-β function. TGF-β activity score also predicts
core pathway components that are candidates for therapeutic targeting in specific cancers.
Epigenetic silencing and miRNA expression, the gene repression mechanisms, provided clues to
their potential role in limiting TGF-β-Smad pathway activity, especially in cancers which show low
pathway activity scores.

Article

Cancer Cell

A Pan-Cancer Proteogenomic Atlas of PI3K/AKT/mTOR Pathway Alterations

Highlights

• Multiplatform-based survey of PI3K/AKT/mTOR across over 10,000 human cancers

• Distinct classes of somatic alteration associated with greater pathway activation

• Functional interrogation of specific mutations in PIK3CA and PIK3R1

• Support for inclusion of IDH1 and VHL mutations within the canonical pathway model

Authors

Yiqun Zhang, Patrick Kwok-Shing Ng, Melanie Kucherlapati, ..., Gordon B. Mills,
David J. Kwiatkowski, Chad J. Creighton

Correspondence

gmills@mdanderson.org (G.B.M.),
dkwiatkowski@rics.bwh.harvard.edu (D.J.K.),
creighto@bcm.edu (C.J.C.)

In Brief

Zhang et al. survey the PI3K/AKT/mTOR pathway in >10,000 human cancers across 32 types.
In addition to known molecular events, some rare PIK3CA and PIK3R1 mutations activate the
pathway, partial copy loss of PTEN or STK11 is associated with poor patient survival, and
IDH1 or VHL mutations can confer mTOR activity.

SUMMARY

Molecular alterations involving the PI3K/AKT/mTOR pathway (including mutation, copy number, protein, or
RNA) were examined across 11,219 human cancers representing 32 major types. Within specific mutated
genes, frequency, mutation hotspot residues, in silico predictions, and functional assays were all informative
in distinguishing the subset of genetic variants more likely to have functional relevance. Multiple oncogenic
pathways including PI3K/AKT/mTOR converged on similar sets of downstream transcriptional targets. In
addition to mutation, structural variations and partial copy losses involving PTEN and STK11 showed
evidence for having functional relevance. A substantial fraction of cancers showed high mTOR pathway activity
without an associated canonical genetic or genomic alteration, including cancers harboring IDH1 or VHL
mutations, suggesting multiple mechanisms for pathway activation.


Significance

Our current model of the PI3K/AKT/mTOR pathway has largely been derived from experimental systems. The Cancer
Genome Atlas (TCGA) pan-cancer cohort represents an opportunity to explore these pathway relationships in the setting
of human cancer. Cause-and-effect relationships embodied by the pathway model can manifest as correlations in human
disease. Integration of genomic with proteomic data may benefit personalized and precision medicine approaches in
helping to assess variants for potential clinical relevance. Manifestation of pathways at the transcription level is distinct from
manifestation at the phospho-protein level, highlighting the importance of proteomic approaches. Over time, previously
unrealized or underappreciated members or connections may be incorporated into the standard pathway model, where TCGA
data may aid in the process of discovery or confirmation.

INTRODUCTION

The phosphatidylinositol 3-kinase (PI3K)/AKT/mammalian target of rapamycin (mTOR) signaling
pathway is one of the main growth regulatory pathways in both normal cells and cancer
(Hennessy et al., 2005; Mayer and Arteaga, 2016). This growth pathway begins with class IA PI3Ks,
which are heterodimers consisting of p110 catalytic and p85 regulatory subunits. Growth
factor receptor tyrosine kinases (RTKs) activate PI3K through phosphorylation of adaptor
proteins such as IRS1/IRS2 (Engelman et al., 2006). These adaptor proteins bind the
amino-terminal domain of the PI3K p85 regulatory subunits through YXXM motifs to reverse its
inhibition of the p110 catalytic subunit; this binding leads to movement of the p85-p110
heterodimer to the cell membrane, where p110 can phosphorylate
phosphatidylinositol-4,5-bisphosphate (PIP2) to generate phosphatidylinositol-3,4,5-trisphosphate
(PIP3). RAS family members can also activate PI3K (Mayer and Arteaga, 2016). The primary negative
regulator of PI3K activation is the phosphatase PTEN, which dephosphorylates PIP3 at the 3'
position (Keniry and Parsons, 2008), with a secondary negative regulator being INPP4B
(inositol polyphosphate-4-phosphatase type II B). PIP3 recruits several pleckstrin homology
domain-containing proteins to the membrane, including AKT and PDK1. AKT is phosphorylated at
Thr308 by PDK1 and at Ser473 by mTOR complex 2 (mTORC2), which increases its kinase activity.
AKT directly and indirectly phosphorylates many downstream proteins, including the
GSKs, p27KIP1, FoxO transcription factors, MDM2, and BAD, to enhance cell survival and growth
(Manning and Cantley, 2007). Furthermore, AKT phosphorylates TSC2 at multiple sites,
to inhibit the GTPase-activating protein function of the TSC protein complex (consisting of
TSC1, TSC2, and TBC1D7) toward Rheb, a RAS family member (Dibble and Manning, 2013;
Laplante and Sabatini, 2012). Rheb-GTP binds to mTOR complex 1 (mTORC1) to activate its
kinase activity toward the S6Ks, 4E-BP1, and other substrates, leading to enhancement
of multiple anabolic biosynthetic pathways that enable production of the building blocks
(e.g., nucleotides) and macromolecules (e.g., ribosomes) required for cell size increase and
mitosis (Dibble and Manning, 2013).

Multiple genetic events have been described that lead to activation of the PI3K/AKT/mTOR
pathway in cancer (Thorpe et al., 2015). Activating mutations in PIK3CA, which encodes
the PI3K p110α catalytic subunit, are common in many cancer types (Samuels et al., 2004;
Thorpe et al., 2015). There are highly focal hotspots of mutation in PIK3CA: E542 and E545
in the helical domain, and H1047 and G1049 in the kinase domain, which activate the kinase
through different mechanisms. Other PI3K p110 isoforms are rarely mutated in cancer
overall, but PIK3CA and PIK3CB, as well as the class II PI3K PIK3C2B, are all amplified in
one or more cancer types (Thorpe et al., 2015). PIK3R1, and less commonly PIK3R2, which
encode the p85α and p85β regulatory subunits of PI3K, are commonly mutated, resulting in
reduced ability to inhibit PI3K p110α (Cheung et al., 2011; Thorpe et al., 2015). PTEN is
subject to both genomic deletion and small point mutations that inactivate its function, and
is one of the most commonly mutated cancer genes overall (Keniry and Parsons, 2008).
AKT1 is occasionally activated by mutation at a single site, E17K (Carpten et al., 2007).
Inactivating mutations in both TSC1 and TSC2 have been identified in cancer at low frequency
(Hornigold et al., 1999), as well as activating mutations in MTOR (Grabiner et al., 2014).
RHEB mutations are rare but focal at Y35, suggesting a driver effect.

With the recent conclusion of the data generation phase of The Cancer Genome Atlas (TCGA),
there is opportunity for systematic analyses of the entire TCGA pan-cancer cohort,
including analyses focusing on specific oncogenic pathways. The aim of our study was to
comprehensively examine the entire PI3K/AKT/mTOR pathway and its components in over
10,000 human cancers and 32 cancer types profiled by TCGA, using multiple molecular
profiling platforms, including proteomics.

RESULTS 

Proteomic  Analysis  of  the  PI3K/AKT/mTOR  Pathway 

Figure 1. Proteomic Signatures of PI3K/AKT and mTOR across Human Cancers

(A) Heatmap of RPPA features considered core to either PI3K/AKT or mTOR pathways across 7,663
cancers. Red, higher expression (values normalized to SDs from the median across all cancers);
blue, lower expression. PI3K/AKT and mTOR features were each summarized into pathway activity
scores for each tumor profile (yellow, higher inferred activity; blue, lower activity; bright
yellow/blue denotes change of 1 standard deviation (SD) from the median). Cancer types (denoted
by TCGA project name) are ordered by low to high average mTOR pathway score.

(B) Boxplots of PI3K/AKT (top) and mTOR (bottom) pathway activity scores, as inferred using RPPA
data. Boxplots represent 5%, 25%, 50%, 75%, and 95%.

(C) Pearson’s correlations between RPPA features across all cancers, involving features core to
PI3K/AKT or mTOR pathways, as well as involving features representing proteins that may act
peripherally upon either pathway. See also Figure S1 and Tables S1 and S2.

Our study involved 11,219 human cancer cases representing 32 different major types, for which
TCGA generated data on one or more of the following molecular characterization platforms
(Table S1): whole-exome sequencing (WES, n = 10,224 cases), whole-genome sequencing
(WGS, n = 1,363), somatic DNA copy by SNP array (n = 10,845), RNA sequencing
(n = 10,224), and reverse-phase protein array (RPPA). We used the RPPA proteomic platform to
analyze 7,663 patient samples from 31 cancer types (with no data available for AML patients).
The RPPA dataset comprised 225 high-quality antibodies that target 166 total proteins and 56
phosphorylated proteins. In this study, we carried out data normalization and batch correction
to allow for direct comparisons between different cancer types. In general, mRNA levels were
significantly correlated with protein levels, but strong correlations between mRNA and
phospho-protein levels involving PI3K/AKT/mTOR pathway members were not observed
(Figure S1A). In this study, we regarded mTOR signaling as a separate pathway from PI3K/AKT,
where the former integrates information from the PI3K/AKT, Ras/MAPK, and LKB1/AMPK pathways
(Laplante and Sabatini, 2012). Following previous studies (Akbani et al., 2014), we developed
pathway signatures for both PI3K/AKT and mTOR components, based on member proteins selected by
literature review, as a means of assessing the overall level of pathway activity
given the variations of individual members.
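
As a rough illustration of how a signature of member proteins might be collapsed into a single per-tumor value, the sketch below standardizes each RPPA feature to SDs from its median across tumors and then sums the signed member features for each tumor. The feature names, the toy values, the +1/-1 signs, and the unweighted sum are all assumptions made for illustration; the exact scoring used in the study is not described in this text.

```python
from statistics import median, pstdev


def standardize(values):
    """Express each value as the number of SDs from the median."""
    med = median(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - med) / sd for v in values]


def pathway_scores(rppa, members):
    """Sum signed, standardized member features for each tumor.

    rppa:    dict mapping feature name -> list of values (one per tumor)
    members: dict mapping feature name -> +1 (activating) or -1 (suppressing)
    """
    n_tumors = len(next(iter(rppa.values())))
    scores = [0.0] * n_tumors
    for feature, sign in members.items():
        for i, z in enumerate(standardize(rppa[feature])):
            scores[i] += sign * z
    return scores


# Hypothetical toy data: three tumors, three features (not study data).
toy_rppa = {
    "AKT_pS473": [0.1, 0.9, 0.5],
    "AKT_pT308": [0.2, 1.0, 0.6],
    "PTEN": [1.2, 0.2, 0.7],
}
toy_members = {"AKT_pS473": 1, "AKT_pT308": 1, "PTEN": -1}
toy_scores = pathway_scores(toy_rppa, toy_members)
```

In this toy example the third tumor sits at the median of every feature, so its score is zero, while the second tumor (high phospho-AKT, low PTEN) receives the highest score.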

For each tumor, RPPA signatures for PI3K/AKT and mTOR were summarized into activity scores
(Figure 1A and Table S2). On average, mTOR scores differed by tumor lineage, with, for example,
KICH (kidney chromophobe) tumors showing the lowest levels of mTOR activity, and with PCPG
(pheochromocytoma and paraganglioma) showing the highest levels (followed by GBM and LGG, or
glioblastoma multiforme and brain lower grade glioma, respectively); at the same time, within
each tumor type a wide range of activity levels was evident (Figure 1B). Across tumor profiles,
PI3K/AKT and mTOR activity scores were highly significantly correlated (Pearson’s r = 0.50,
p ~ 0), although many cancer cases showed high mTOR activity but low PI3K/AKT activity or vice
versa (Figures 1A and 1B), indicative of a certain degree of decoupling between the two pathway
branches. Individual members of the PI3K/AKT signature were strongly correlated in protein
expression with each other across cancers (Figure 1C). mTOR pathway-related members were also
highly inter-correlated (Figure 1C), with distinct clusters involving 4EBP1- and S6-related
features, respectively, and with phospho-RICTOR negatively correlated with phospho-mTOR
(r = -0.14, p < 1E-30). Other protein features strongly correlated with PI3K/AKT/mTOR
signaling included members of the MAP Kinase pathway (Figures S1B and S1C). When considering a
number of additional RPPA features for proteins understood to act peripherally on
PI3K/AKT or mTOR signaling, these tended to show weaker correlations with PI3K/AKT and mTOR
features (Figure 1C). INPP4B and AMPK were negatively correlated with mTOR activity as
expected, while within subsets of tumors other proteins would presumably have pathway-related
roles that may not be reflected in more global analyses.
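
The correlations quoted above are Pearson coefficients between per-tumor values. A minimal sketch of that calculation follows; the two score lists are invented toy numbers, not study data, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-tumor activity scores (toy values only).
toy_pi3k_akt = [-1.2, -0.4, 0.1, 0.8, 1.5]
toy_mtor = [-0.9, 0.6, -0.3, 0.2, 1.1]
toy_r = pearson_r(toy_pi3k_akt, toy_mtor)
```

A moderate positive coefficient of this kind is compatible with individual tumors that score high on one pathway and low on the other, which is the decoupling noted in the text.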

Figure 2. Somatic Mutations and DNA Copy and Structural Alterations Involving Components of the PI3K/AKT/mTOR Pathway across
Human Cancers

(A) Diagram of somatic mutation and copy-number alteration (CNA) frequencies involving components of the PI3K/AKT/mTOR pathway. Key genes (with
significant or sizable frequencies of alteration) are indicated by rectangles, with the percentages of somatic mutations and CNAs shown in the left and right portions
of each rectangle, respectively. Significantly altered genes (from Chang et al., 2016; Kandoth et al., 2013; Lawrence et al., 2014; Zack et al., 2013; percentages
representing significant alterations are underlined) are bounded by orange lines. Red, potentially activating genetic alterations; blue, potentially inactivating
genetic alterations. Copy loss represents either “high-level” deletion (approximating homozygous deletion) or mutation in combination with “low-level” deletion
(partial loss).

(B) By cancer type, percentages of somatic mutation or copy alteration for each indicated gene. Amplification denotes “high-level” copy gain. Numbers of cases
denote representation on WES data platform.

(C) Genomic rearrangements (represented in circos plot) involving PTEN, INPP4B, STK11, TSC1, TSC2, PIK3R1, or PPP2R1A, based on analysis of 1,363 cases
with WGS data.

(D) Left: alterations involving PTEN (somatic mutation, copy alteration, structural variation, or SV) found in the set of 1,093 cancer cases having both WGS and
RPPA data available (protein values normalized to standard deviations, or SDs, from the median). Right: boxplot of PTEN protein expression by alteration class. Boxplots
represent 5%, 25%, 50%, 75%, and 95%. p Values by t test on log-transformed values. See also Figure S2 and Tables S3 and S4.
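
For readers who want to reproduce the five-number summary used by the boxplots in the figure legends (5%, 25%, 50%, 75%, and 95%), a minimal sketch is given below. It relies on `statistics.quantiles` from the Python standard library; the choice of the "inclusive" interpolation method and the toy input are assumptions, since the legends do not say how percentiles were interpolated.

```python
from statistics import quantiles


def box_summary(values):
    """Return the 5th, 25th, 50th, 75th, and 95th percentiles of values."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(values, n=20, method="inclusive")
    return cuts[0], cuts[4], cuts[9], cuts[14], cuts[18]


# Toy input: the integers 0..100, chosen so each percentile equals its rank.
toy_summary = box_summary(list(range(101)))
```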


Somatic  DNA  Alterations  Involving  the  PI3K/AKT/mTOR 
Pathway 

We examined gene mutations (using WES, n = 10,224 cases) and
somatic DNA copy alterations (by SNP 6.0 arrays, n = 10,845
cases), focusing on genes in the canonical PI3K/AKT/mTOR
pathway (Figure 2A and Table S3). Frequencies of somatic alteration
for key genes in the pathway were tabulated across all
cancers as well as within each cancer type according to TCGA
project (Figure 2B and Table S4). A number of genes in the
pathway were found significantly mutated or copy altered in
pan-cancer analyses (Chang et al., 2016; Kandoth et al., 2013;
Lawrence et al., 2014; Zack et al., 2013), including PIK3CA
(14% mutated across all cancers; 6% amplified), PTEN (9%
mutated; 7% deletion or two-hit loss), PIK3R1 (4% mutated),
PPP2R1A (2% mutated), AKT1 (1% mutated), AKT1 (3% amplified),
TSC1 (2% mutated), STK11 (2% mutated; 1% deletion or
two-hit loss), RICTOR (3% amplified), and MTOR (4% amplified).
With  the  notable  exception  of  AKT3,  copy  number  alterations  of 
PI3K/AKT/mTOR  pathway  member  genes  were  highly  correlated 
with  their  mRNA  expression  (Figure  S2A).  When  overlaid  with 
mutation  frequency  data  from  human  tumors,  the  model  of  the 
PI3K/AKT/mTOR  pathway  (Figure  2A)  can  indicate  which 
pathway  members  or  interactions  may  be  most  relevant  in  the 
context  of  cancer.  However,  even  genes  with  a  low  frequency 


Cancer  Cell  31,  820-832,  June  12,  2017  823 


CelPress 


of DNA alterations (e.g., AKTS1, MAPKAP1, MLST8, PDK1) may
be  critical  in  individual  cancer  cases  or  in  specific  cancer  types 
or  subtypes  not  included  here,  in  which  they  may  be  more 
commonly  altered. 
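
As an illustration of how alteration frequencies of this kind can be tabulated, the following minimal Python sketch computes the percentage of cases in which a gene is altered, across all cancers and within each cancer type. The record layout (case ID, cancer type, set of altered genes) and the toy cases are assumptions made for the example; they are not taken from the TCGA data or from the study's own code.

```python
from collections import defaultdict

def alteration_frequencies(cases, gene):
    """Return (overall %, {cancer type: %}) of cases with `gene` altered.

    `cases` is a list of (case_id, cancer_type, altered_genes) tuples;
    this layout is hypothetical and chosen only for illustration.
    """
    totals = defaultdict(int)
    altered = defaultdict(int)
    for _case_id, cancer_type, genes in cases:
        totals[cancer_type] += 1
        if gene in genes:
            altered[cancer_type] += 1
    # Percentage within each cancer type, then across all cases pooled.
    by_type = {t: 100.0 * altered[t] / totals[t] for t in totals}
    overall = 100.0 * sum(altered.values()) / sum(totals.values())
    return overall, by_type

# Toy input (made-up cases, not study data).
toy_cases = [
    ("c1", "BRCA", {"PIK3CA"}),
    ("c2", "BRCA", set()),
    ("c3", "UCEC", {"PTEN", "PIK3CA"}),
    ("c4", "UCEC", {"PTEN"}),
]
overall, by_type = alteration_frequencies(toy_cases, "PIK3CA")
print(overall, by_type)
```

The same function can be called once per gene to build a gene-by-cancer-type table of the sort summarized in Figure 2B.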

Genomic  rearrangements  represent  another  class  of  somatic 
alterations  impacting  gene  function.  Out  of  1,363  cases  with 
WGS  data  available  (1,218  by  low-pass  sequencing),  63  cases 
(~5%)  harbored  a  rearrangement  within  pathway  suppressor 
genes  PTEN  (39  cases),  INPP4B  (14),  STK11  (5),  TSC1  (2), 
TSC2  (2),  PIK3R1  (2),  or  PPP2R1A  (2)  (Figure  2C).  By  structural 
variation (SV), copy loss (partial or total), or mutation, PTEN
was found altered in 40% of cancers with both RPPA and
WGS data, with PTEN protein expression most impacted in tumors
with SV, homozygous loss, or nonsense/indel/frameshift
mutations (Figure 2D). In addition to PTEN, SVs within STK11
and TSC1 were also associated with decreased expression (Figure
S2B). Furthermore, high- and low-level copy number loss for
several  pathway  genes  were  strongly  correlated  with  reduced 
mRNA  levels  (Figure  S2B),  and  20  cases  harbored  candidate 
gene  fusions  involving  PIK3CA,  AKT1,  AKT2,  AKT3,  or  MTOR 
(Figure  S2C  and  Table  S3). 

Recurrently  Mutated  Residues  in  Key  Genes  Associated 
with  Protein  Activation 

A large proportion of mutations identified in driver genes that
activate PI3K/AKT/mTOR are of low occurrence, highlighting
the need to functionally annotate the long tail of infrequent mutations
present in heterogeneous cancers (Dogruluk et al., 2015).
For example, PIK3CA is the gene most commonly activated by
mutation in the cancer genome, with mutations being most
frequent at positions E542, E545, and H1047 (Figure 3A); on
the other hand, 13% of PIK3CA mutations observed occurred
in a single case and showed no significant pattern of occurrence.
Somatic copy alteration represents another potential mechanism
for altering gene function where, for example, amplification
of PIK3CA impacts p110α protein expression (Figure 3B). Previous
pan-cancer sequence analyses (Chang et al., 2016) have
identified recurrent mutational hotspots, where such hotspots
would presumably have greater impact on protein function. In
the case of PIK3CA, 73% of somatic, nonsilent mutation variants
identified in the TCGA pan-cancer cohort involved a hotspot residue
as identified by Chang et al., while 13% of PIK3R1 mutations and
7% of MTOR mutations involved a hotspot residue (Figure 3C). In
addition, algorithms such as Mutation Assessor (Reva et al.,
2011) have predicted the likely functional impact of somatic mutation,
e.g., based on evolutionary conservation of the affected
amino acid in protein homologs.
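
The hotspot percentages quoted above amount to counting how many observed variants fall on a listed residue. A minimal Python sketch of that count follows; the variant strings, the parsing rule, and the "A999V" placeholder variant are illustrative assumptions, and only the three PIK3CA positions named in the text are used as the hotspot set.

```python
import re

def hotspot_fraction(variants, hotspot_positions):
    """Fraction of protein-change variants whose residue is a hotspot.

    `variants` are strings such as "E545K" (reference residue, position,
    new residue); `hotspot_positions` is a set of residue numbers.
    Both inputs are assumptions for this example.
    """
    hits = 0
    for variant in variants:
        match = re.match(r"^[A-Z*](\d+)", variant)
        if match and int(match.group(1)) in hotspot_positions:
            hits += 1
    return hits / len(variants)

# Toy variant list; "A999V" is a made-up non-hotspot placeholder.
toy_variants = ["E545K", "H1047R", "E542K", "A999V"]
toy_hotspots = {542, 545, 1047}
print(hotspot_fraction(toy_variants, toy_hotspots))
```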

As the above genes, as well as PTEN, presumably act upon
AKT (Figure 2A), phospho-protein expression of AKT was examined
in relation to tumor groups as defined by somatic alteration
of a key gene (Figure 3D). For each gene considered, mutations
were separated on the basis of whether or not a prediction of mutation
functionality could be made (by residue hotspot, by Mutation
Assessor, by manual literature review, or by nonsense/
frameshift/indel involving PTEN or PIK3R1). For each of the
genes considered (AKT1, MTOR, PIK3CA, PIK3R1, and PTEN),
tumors harboring mutations that were predicted to have functional
effects had elevated phospho-AKT levels on average,
compared with tumors that did not harbor an alteration; in addition,
tumors with mutations not predicted to be functional
showed either a lesser effect or no significant effect on
phospho-AKT. PTEN copy losses were also associated with
AKT activation, while, interestingly, PIK3CA amplifications and
copy alterations involving other specific genes (Figure S3)
were not.

In addition to analysis of significantly mutated residues and of
phospho-protein expression, functional studies using cell lines
represent another way to annotate mutations in terms of their
oncogenic potential. Using MCF10A and Ba/F3 cells, 69 different
nonsilent PIK3CA mutation variants were functionally assessed
in vitro for their activating potential (Figures 4A and S4; Table
S5). Most variants tested showed some level of functionality
(from weak to strong) in at least one of the two cell lines, while
14 variants showed no functional effects and two showed inhibitory
or inactivating effects. The degree of growth activation varied
considerably, with the three highly recurrent PIK3CA site
(E542, E545, and H1047) mutants showing some of the highest
degrees of activity in this assay. In another experiment, 35
different nonsilent PIK3R1 mutation variants were functionally
interrogated in Ba/F3 cells (Figure 4B). When the results of the
functional studies were aligned with data from TCGA, a significant
trend was observed for both PIK3CA and PIK3R1, whereby
variants  that  were  associated  with  functionality  in  vitro  had  a 
higher  frequency  of  occurrence  in  human  tumors  (Figure  4C), 
suggesting  that  natural  selection  favored  tumor  development 
for  those  variants  with  greater  functional  effects.  Most  variants 
showing  some  functionality  also  had  higher  phospho-AKT  on 
average,  compared  with  tumors  with  the  corresponding  wild- 
type  gene  (Figures  4A  and  4B),  although  variants  associated 
with  higher  phospho-AKT  were  not  necessarily  associated  with 
higher  phospho-TSC2  (downstream  in  the  pathway  from  AKT). 
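
The comparison of variant frequency between functional categories (Figure 4C) is reported with a Mann-Whitney U test. The Python sketch below shows one way such a test can be computed from two lists of frequencies; it uses the normal approximation without tie or continuity correction, which is a simplifying assumption and may differ from the procedure used in the study. The frequency values are made up.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation
    (no tie or continuity correction). Returns (U for x, p value)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at index i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1..j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u, p

# Made-up variant frequencies (percent) for two functional categories.
activating = [5.0, 6.0, 7.0]
no_effect = [1.0, 2.0, 3.0]
u_stat, p_value = mann_whitney_u(activating, no_effect)
print(u_stat, p_value)
```

With samples this small an exact test would normally be preferred; the approximation is used here only to keep the example short.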

Transcriptomic  Analysis  of  PI3K/AKT/mTOR  Pathway 

Signaling pathways that influence cell growth transduce signals
to the nucleus, leading to activation or deactivation of the transcription
of specific genes (Hanahan and Weinberg, 2000). Previously,
we had defined a PI3K/AKT/mTOR transcriptional
(mRNA) signature, based on the set of genes either induced or
repressed by PI3K or mTOR inhibitors (Creighton et al., 2010).
We applied this signature to the Library of Integrated Network-
based Cellular Signatures (LINCS) database (Duan et al., 2014)
of perturbational expression profiles across multiple cell and
perturbation types. In the LINCS L1000 expression dataset (consisting
of ~1,000 genes), the PI3K/AKT/mTOR mRNA signature
was inversely associated with the transcriptional responses of
cell lines to PI3K/AKT/mTOR inhibitors (Figure 5A). We evaluated
the signature against the LINCS expression profiles of cells
treated with short hairpin RNAs (shRNAs) for ~6,000 different
genes; knock down of pathway effectors (e.g., MTOR or RPTOR)
resulted in gene signature patterns inversely correlated to those
of our PI3K/AKT/mTOR signature, while knock down of pathway
suppressors (e.g., PTEN or INPP4B) resulted in signature patterns
positively correlated with those of our signature (Figure 5B
and Table S6). Notably, knock down of MYC and KRAS also suppressed
the PI3K/AKT/mTOR signature; furthermore, when
scoring TCGA pan-cancer mRNA profiles for pre-defined signatures
of PI3K/AKT/mTOR, MYC, and k-ras, cancers scoring high
for PI3K/AKT/mTOR also tended to score high for MYC and k-ras
Figure 3. Distributions of Mutations in Key
PI3K/AKT/mTOR Pathway Genes and Association
with Protein Activation

(A) PIK3CA nonsilent, somatic variant frequencies
and distribution across domain-annotated p110α
protein structure. “Recurrent” denotes mutation
event observed in two or more tumor cases. “Hotspot”
denotes recurrently mutated residues as
identified by pan-cancer sequence analyses (Chang
et al., 2016). “MA” denotes “medium” or “strong”
functional prediction by Mutation Assessor algorithm
(Reva et al., 2011).

(B) Boxplot of p110α expression by PIK3CA alteration
class (gene amplification, gain of one to two
copies, mutation, or none of the above, i.e. “unaligned”).
p Values by t test on log-transformed
values.

(C) Distributions of nonsilent and somatic variants
within PIK3R1 (top) and MTOR (bottom) across their
respective domain-annotated protein structures.

(D) Boxplot of AKT pS473 phospho-protein expression
by mutation (mut.) or copy alteration class,
with the unaligned cases having none of the
listed alteration types. p.f., predicted functional
mutations (by hotspot, Mutation Assessor analysis,
literature review, or nonsense/frameshift/indel
involving PTEN or PIK3R1); amp., high-level gene
amplification; low-lev. and high-lev., low- and high-
level copy deletions, respectively. p Values by t test
on log-transformed values. n.s., not significant
(p > 0.05). Boxplots represent 5%, 25%, 50%, 75%,
and 95%. Points in boxplots are colored according
to tumor type as defined by TCGA project as indicated
in (D). See also Figure S3.
(Figures 5C, S5A, and S5B), suggesting that multiple oncogenic
signaling pathways may converge on similar sets of transcriptional
targets. The above mRNA signatures would represent
more than cell proliferation processes, given how the signatures
were originally derived (Creighton et al., 2010), the lack of cell-
cycle regulators in the top LINCS shRNA results (Figure 5B),
and the signature association with key alterations in human tumors
(Figure 5C).
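
To make the idea of scoring expression profiles for a signature and then correlating scores concrete, the following Python sketch computes a per-sample score as the mean z-scored expression of "up" genes minus that of "down" genes, followed by a Pearson correlation. This scoring rule is an assumption chosen for illustration and may not match the scoring used in the study; the gene names UP1 and DOWN1 and all values are hypothetical.

```python
import math
from statistics import mean, pstdev

def zscores(values):
    """Standardize one gene across samples (assumes non-constant values)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def signature_scores(expr, up_genes, down_genes):
    """Per-sample score: mean z of up genes minus mean z of down genes.

    `expr` maps gene name -> list of expression values, one per sample.
    """
    z = {gene: zscores(vals) for gene, vals in expr.items()}
    n_samples = len(next(iter(expr.values())))
    return [
        mean(z[g][s] for g in up_genes) - mean(z[g][s] for g in down_genes)
        for s in range(n_samples)
    ]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical three-sample dataset with one up and one down gene.
toy_expr = {"UP1": [1.0, 2.0, 3.0], "DOWN1": [3.0, 2.0, 1.0]}
scores = signature_scores(toy_expr, ["UP1"], ["DOWN1"])
other_scores = [1.0, 2.0, 3.0]  # e.g., scores for a second signature
print(scores, pearson(scores, other_scores))
```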
As a means of identifying a transcriptional
signature associated with the
PI3K/AKT/mTOR pathway, we examined
datasets from Garnett et al. (2012), which
included both gene expression and
drug sensitivity data for 131 drugs on a
set of 594 human cancer cell lines. To
derive gene expression correlates of
sensitivity to pathway inhibition, half
maximal inhibitory concentration values
for 11 different inhibitors of PI3K/AKT/mTOR
were normalized and averaged to
obtain a single drug sensitivity score
across cell lines. After correcting for
expression differences specific to tumor
type, 146 genes were significantly associated
(p < 0.01, generalized linear
model) with pathway inhibitor sensitivity
(Figure 5D and Table S6). Across cell lines, this inhibitor sensitivity
tivity  signature  correlated  significantly  with  PI3K/AKT  phospho- 
protein  levels  (Figure  5E),  but  showed  little  overlap  with  the 
above  CMAP  signature  (from  Figure  5A).  Furthermore,  when 
scoring  TCGA  pan-cancer  mRNA  profiles  for  the  above 
signatures,  tumors  that  scored  high  for  the  inhibitor  signature 
tended  to  score  low  for  the  CMAP  signature  and  vice  versa, 
and  PI3K/AKT  proteomic  score  (but  not  mTOR  score)  was 




[Figure 4 graphic. Panel A: PIK3CA functional assay, with TCGA pan-cancer association tracks (frequency, AKT pS473, TSC2 pT1462, hotspot residue) for each PIK3CA mutation. Panel B: PIK3R1 functional assay (Ba/F3), with the same TCGA tracks for each PIK3R1 mutation. Color key: NFE/NDFW, weak activating, moderate activating, strong activating, inhibitory/inactivating, no data; protein expression scaled from -1 SD (lower) to +1 SD (higher). Panel C: variant frequency by functional category for PIK3CA (NDFW/inhibitory, moderate activating, strong activating; p = 1E-6 and p < 0.0001) and for PIK3R1 (NDFW, activating; p < 0.01).]

Figure  4.  Functional  Assessment  of  Specific  PIK3CA  and  PIK3R1  Variants  by  Cell  Line  Viability  Assays 

(A  and  B)  Ba/F3  or  MCF-10A  cells  were  transfected  with  wild-type  (WT)  or  indicated  mutant  cDNA  of  PIK3CA  (A)  or  PIK3R1  (B)  then  cultured  for  4  weeks  and 
harvested  for  viability  assay.  The  extent  of  functionality  conferred  by  the  variant  is  indicated  by  colorgram.  NFE/NDFW,  no  functional  effect/no  difference  from 
wild-type.  For  the  mutant  variants  assessed,  corresponding  human  cancer  data  from  TCGA  are  shown,  including  frequency  of  the  variant  (relative  to  other 
variants  found  for  the  same  gene)  and  average  protein  expression  for  AKT  pS473  and  TSC2  pT1462.  Hotspot  residue,  from  Chang  et  al.  (2016). 

(C) For PIK3CA (left) and PIK3R1 (right), boxplots of variant frequency in TCGA human tumors (relative to other variants found for the same gene) by functional
assay results. p Values by Mann-Whitney U test. Boxplots represent 5%, 25%, 50%, 75%, and 95%. See also Figure S4 and Table S5.


again highly correlated with the inhibitor signature score (Figures
5E, S5C, and S5D).
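
The step in which inhibitory concentration values for several inhibitors are "normalized and averaged" into one sensitivity score per cell line can be sketched as follows. The text does not state the normalization used, so per-drug z-scoring across cell lines is an assumption here, and the drug names, cell line names, and values are hypothetical.

```python
from statistics import mean, pstdev

def sensitivity_scores(ic50):
    """Average per-drug z-scores into one score per cell line.

    `ic50` maps drug -> {cell_line: value}. Each drug is standardized
    across the cell lines tested with it (assumed normalization), then
    the standardized values are averaged per cell line. Under this
    convention, a lower score means greater sensitivity.
    """
    per_line = {}
    for _drug, values in ic50.items():
        vals = list(values.values())
        mu, sd = mean(vals), pstdev(vals)
        for line, v in values.items():
            per_line.setdefault(line, []).append((v - mu) / sd)
    return {line: mean(z) for line, z in per_line.items()}

# Hypothetical IC50-like values for two drugs on two cell lines.
toy_ic50 = {
    "drugA": {"line1": 1.0, "line2": 3.0},
    "drugB": {"line1": 2.0, "line2": 6.0},
}
toy_scores = sensitivity_scores(toy_ic50)
print(toy_scores)
```

Cell lines not tested with a given drug are simply left out of that drug's average, so each score is the mean over the drugs actually measured.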

Molecular  Correlates  of  Patient  Survival  Involving 
PI3K/AKT/mTOR  Pathway  Components 

Molecular correlates of cancer patient survival can offer insights
into the pathways and processes underlying more aggressive
disease (The Cancer Genome Atlas Research Network,
2013). For specific cancer types (e.g., breast and lung adenocarcinoma),
the PI3K/AKT/mTOR pathway has been associated
with aggressive disease (The Cancer Genome Atlas Network,
2012; The Cancer Genome Atlas Research Network, 2013).
In this present study, we sought to define survival correlates in
pan-cancer analyses, leveraging the large numbers of patients
available (these numbers helping to balance the relatively short
patient follow-up times that characterize a number of individual
TCGA projects). As some cancer types are inherently more
aggressive than others (Hoadley et al., 2014), we carried out two
separate tests for each molecular feature examined: an “uncorrected”
test across all cancers regardless of type and a “corrected”
test incorporating cancer type (by TCGA project) as a
covariate. Features more strongly associated with an aggressive
cancer type but having a survival association that was not independent
of cancer type (PIK3CA mutation, for example; Hoadley
et al., 2014) may show significance for the uncorrected but not
the corrected survival test.
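
The difference between an uncorrected and a cancer-type-corrected survival test can be illustrated with a log-rank test that is optionally stratified by cancer type, the form of correction named in the Figure 6 legend. The Python sketch below is a minimal illustration on made-up data, not the study's implementation; the variable names and the toy cohort are assumptions. In the toy cohort the feature is enriched in an aggressive cancer type but has no effect within either type, so the stratified test returns p = 1 while the unstratified test gives a smaller p value.

```python
import math

def logrank_components(times, events, groups):
    """Observed-minus-expected deaths in group 1 and the variance,
    for a two-group log-rank comparison (groups coded 0/1)."""
    o_minus_e, var = 0.0, 0.0
    for t in sorted({tt for tt, e in zip(times, events) if e}):
        at_risk = [g for tt, g in zip(times, groups) if tt >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for tt, e in zip(times, events) if tt == t and e)
        d1 = sum(1 for tt, e, g in zip(times, events, groups)
                 if tt == t and e and g == 1)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, var

def logrank_p(times, events, groups, strata=None):
    """Chi-square (1 df) p value. With `strata`, components are summed
    across strata, giving a stratified log-rank test."""
    strata = strata or [0] * len(times)
    total_oe, total_var = 0.0, 0.0
    for s in set(strata):
        idx = [i for i, v in enumerate(strata) if v == s]
        oe, var = logrank_components(
            [times[i] for i in idx],
            [events[i] for i in idx],
            [groups[i] for i in idx],
        )
        total_oe += oe
        total_var += var
    chi2 = total_oe ** 2 / total_var
    return math.erfc(math.sqrt(chi2 / 2.0))

# Toy cohort: type "A" is aggressive, type "B" is indolent; the feature
# (group 1) is over-represented in "A". All deaths are observed.
times  = [1, 1, 2, 2, 1, 2, 10, 20, 10, 10, 20, 20]
groups = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
strata = ["A"] * 6 + ["B"] * 6
events = [1] * 12
p_uncorrected = logrank_p(times, events, groups)
p_corrected = logrank_p(times, events, groups, strata)
print(p_uncorrected, p_corrected)
```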


Numerous protein expression or genomic alteration features
involving PI3K/AKT/mTOR pathway members were significantly
associated with patient outcome in pan-cancer analyses (Figure
6A), a number of these features remaining significant after
correcting for cancer type. Features significantly associated
with  worse  patient  outcome,  independent  of  cancer  type, 
included  STK11  mutation,  STK11  copy  loss,  PTEN  copy  loss, 
PIK3CA  amplification,  and  higher  phospho-4EBP1  expression. 
Focusing  on  PTEN  and  STK11  copy  alterations,  these  features 
were  found  significant  within  several  individual  cancer  types, 
with  the  aggregated  patterns  across  cancer  types  denoting 
pan-cancer  significance  (Figure  6B).  Interestingly,  for  both 
PTEN  and  STK11,  low-level  deletion  (approximating  partial 
copy  loss)  but  not  high-level  deletion  (approximating  total  loss) 
was  associated  with  significantly  worse  outcome  compared 
with  wild-type  (Figures  6C  and  6D);  loss  of  one  copy  combined 
with  somatic  mutation  of  the  other  copy  was  associated  with 
the  poorest  outcome.  For  both  PTEN  and  STK11,  neither  high- 
level  deletion  nor  mutation  without  copy  loss  could  be 
associated  with  worse  outcome,  where  in  this  instance,  survival 
differences  by  tumor  type  were  a  likely  confounder  (e.g.,  65%  of 
the  PTEN  mutation  with  no  copy  alteration  group  were  UCEC,  or 
uterine  corpus  endometrial  carcinoma,  cases).  As  a  group,  gene 
transcription  targets  of  the  PI3K/AKT/mTOR  pathway  (based  on 
the  signature  described  in  Figure  5A)  were  also  associated  with 
worse  patient  outcome  (Figure  6E). 
Figure 5. Survey of Two Distinct PI3K/AKT/mTOR-Associated Gene Transcription Signatures across Human Cancers

(A) A previously defined gene transcription signature of PI3K/AKT/mTOR (Creighton et al., 2010) (originally derived using the Connectivity Map, or CMAP, dataset)
was  re-examined  in  the  LINCS  database  of  perturbational  expression  profiles,  with  the  PI3K/AKT/mTOR  inhibitor  treatment  group  compared  with  control  group. 

(B) The PI3K/AKT/mTOR “CMAP” signature was evaluated against the LINCS expression profiles of cells treated with shRNAs for ~6,000 different genes. In the
plot  shown,  shRNAs  are  ranked  according  to  the  overall  similarities  in  their  induced  expression  patterns  with  those  of  the  PI3K/AKT/mTOR  signature;  for  example, 
for  shRNAs  represented  on  the  left  of  x  axis,  knock  down  of  the  gene  results  in  a  pattern  inverse  of  that  of  the  PI3K/AKT/mTOR  signature.  Red,  canonical  promoter 
of  PI3K/AKT/mTOR  pathway;  blue,  canonical  suppressor. 

(C) TCGA pan-cancer mRNA profiles (n = 10,224 cases) were each scored for various transcriptional signatures associated with PI3K/AKT/mTOR, MYC, or k-ras
pathways  (defined  previously  using  experimental  models).  Pearson’s  correlations  between  indicated  transcriptional  and  proteomic  signature  scores  across  the 
pan-cancer  profiles  are  indicated,  along  with  correlations  of  the  signatures  with  specific  genomic  alterations. 

(D) A gene expression signature of sensitivity to PI3K/AKT/mTOR inhibition in cancer cell lines, consisting of 146 genes (p < 0.01 by t test and p < 0.01 in regression
model  incorporating  tumor  type  as  a  confounder),  was  derived  using  the  dataset  of  Garnett  et  al.  (2012). 

(E)  Top:  for  cell  lines  with  both  RPPA  and  mRNA  data  (n  =  231),  Pearson’s  correlations  between  key  PI3K/AKT/mTOR  proteins  and  PI3K/AKT/mTOR  inhibition 
sensitivity,  as  defined  by  either  drug  treatment  or  gene  signature  from  (D).  Bottom:  TCGA  pan-cancer  mRNA  profiles  were  each  scored  for  the  drug  sensitivity 
signature  from  (D);  Pearson’s  correlations  across  the  pan-cancer  profiles,  involving  transcriptional  and  proteomic  signature  scores  and  selected  genomic 
features,  are  indicated.  See  also  Figure  S5  and  Table  S6. 


Genetic/Genomic  Alteration  Classes  in  Relation  to 
PI3K/AKT/mTOR  Pathway  Activation 

We then sought to examine the effects on pathway activation of
some key genomic events in the tumors in which they occurred
(including mutations represented in Figure 2A and copy alterations
involving PIK3CA, PTEN, and STK11). Of the 7,099 tumor
cases examined (with both mutation and protein data), 4,468
(63%) harbored at least one nonsilent somatic mutation or
copy alteration involving PI3K/AKT/mTOR pathway (Figures 7A
and S6A). Another set of 764 tumors showed high levels of
phospho-AKT (>0.5 SD of pS473 from the median across
samples) but without any of the genetic or genomic alterations
associated with the above 4,468 tumors, and another set of
394 tumors showed low levels of phospho-AMPK (<-0.5 SD)
without an associated genetic or genomic alteration. In comparison
with a set of tumors that did not show pathway alteration
at the DNA or protein level (an “unaligned” set, n = 1,058), mutation
or copy alteration of individual PI3K/AKT/mTOR pathway
Figure 6. Pan-Cancer Molecular Correlates of Patient Survival Involving PI3K/AKT/mTOR Pathway Components

(A) Pathway diagram representing molecular features at the levels of mRNA (using n = 10,152 cancer cases in total with both mRNA and survival data), protein
(n = 7,532), copy number (n = 10,685), and somatic mutation (n = 10,054). Red, significant correlation with worse patient outcome; blue, significant correlation with
better outcome. “Tumor type corrected” survival p values denote significant correlation in model incorporating both the molecular feature and cancer type.
p Values <0.05 correspond to an estimated false discovery rate (Storey and Tibshirani, 2003) of <10%.

(B) Forest plots of hazard ratios by tumor type (with 95% confidence intervals) for patient death for PTEN copy alteration (left) and for STK11 copy alteration (right).
Hazard ratios based on log (tumor/normal) copy values; hazard ratio less than 1 (blue) denotes trend of copy loss with worse outcome. p Value for overall survival
correlation by meta-analysis fixed effects model. Asterisks denote cancer types that were individually significant (p < 0.05).

(C  and  D)  Kaplan-Meier  plot  of  overall  survival  of  patients  stratified  by  PTEN  (C)  or  STK11  (D)  alteration.  Low  del.,  low-level  deletion  (partial  loss,  no  detected 
mutation);  high  del.,  high-level  deletion  (approximating  total  loss);  mut.,  somatic  nonsilent  mutation  (no  copy  loss);  mut.  +  del.,  copy  loss  combined  with  mutation. 
Corrected  p  values  by  stratified  log  rank  test  incorporate  cancer  type  as  a  confounder.  Asterisks  denote  groups  significantly  different  from  wild-type  (WT)  group 
by  stratified  log  rank  test. 

(E)  Kaplan-Meier  plot  of  overall  survival  of  patients  stratified  by  PI3K/AKT/mTOR  transcriptional  signature  (CMAP  signature).  Corrected  p  values  by  stratified  log 
rank  test  incorporate  cancer  type  as  a  confounder. 


members  in  general  could  be  associated  with  higher  PI3K/AKT 
or  mTOR  signaling  as  measured  by  protein  arrays  (Figures  7B 
and  S6B).  Notably,  STK11  alteration  or  low  phospho-AMPK 
was  strongly  associated  with  high  mTOR  signaling,  but  not 
with  high  PI3K/AKT  signaling,  consistent  with  the  LKB1/AMPK 
pathway  acting  on  mTOR  independently  of  PI3K/AKT  (Figure  2A). 
Mutations associated with RTK signaling were not strongly associated
with PI3K/AKT/mTOR activation (Figures 7A and 7B),
indicative of decoupling between PI3K/AKT/mTOR and RTK.
Low-level  as  well  as  high-level  copy  losses  of  PTEN  and 
STK11  could  be  associated  with  greater  mTOR  signaling. 

PI3K/AKT/mTOR pathway activity, when measured at the protein
level, was explained by known mutations or copy alteration
in most but not all of the cases examined, suggesting additional,
unexplained, or underappreciated mechanisms of pathway activation.
Focusing on the “High P-AKT” tumor group (n = 764),
with high phospho-AKT but lacking a DNA alteration classically
associated with PI3K/AKT activation, these tumors were highly
enriched for specific cancer types including LGG, PRAD, KIRC,
and PCPG (Figure 7C), as well as for IDH1 mutations (associated
primarily with LGG, i.e., gliomas) and VHL mutations (associated
with renal cancers). A set of microRNAs could also help distinguish
the “High P-AKT” group (Figure S6C). Proteins that were
highly expressed specifically within the High P-AKT group (Figure
7D) included phospho-ERK, phospho-SRC, and phospho-
NDRG1. These mutations and proteins would suggest a model
(Figure 7E) whereby mutant IDH1 may lead to high phospho-
ERK (Chaturvedi et al., 2013) and SRC can activate PI3K (Chen
et al., 2015; Su et al., 2016), and where activated mTOR signaling
may activate transcription targets of hypoxia via HIF-1α (particularly
in the absence of VHL), including NDRG1 and growth factors
that may lead to a further increase in ERK and PI3K signaling
(Clark, 2009). Notably, VHL was recently found to directly
suppress AKT activity (Guo et al., 2016), and generation of



[Figure 7 graphic. Panel A: tumor groups (PI3K/RTK genomic alteration, HIGH P-AKT, LOW P-AMPK, unaligned) with tracks for AKT pS473, AMPK pT172, ERK p202/204, PI3K-AKT score, and mTOR score, shown with a tumor type key. Panel B: *p < 0.0005 versus unaligned; n.s., p > 0.01. Panel C: -log10(p value) of enrichment within the HIGH P-AKT group for tumor types (including PRAD, KIRC, PCPG, KIRP, ACC, and THYM) and for IDH1, VHL, and IDH2 mutations; *enriched within LGG + high P-AKT; **enriched within KIRC/KIRP + high P-AKT. Panel D: proteins shown are 53BP1, YB1, P-CADHERIN, ASNS, CYCLIN B1, PCNA, TFRC, XRCC1, DUSP4, CYCLIN D1, MEK1 p217/221, ERK p202/204, SRC pY527, PKC βII pS660, P38 p180/182, YAP pS127, NDRG1 pT346, and P21.]

Figure  7.  Tumor  Classes  as  Defined  by  PI3K/AKT/mTOR-Related  Alterations 

(A)  Tumor  cases  were  separated  into  distinct  groups  on  the  basis  of  genetic  or  genomic  alteration  and  of  protein  expression:  (1)  cases  with  nonsilent  somatic 
mutation  or  copy  alteration  involving  selected  PI3K/AKT/mTOR  pathway  members  as  shown  (left  side,  n  =  4,468  cases),  (2)  additional  cases  with  nonsilent 
mutation  involving  selected  receptor  tyrosine  kinase  (RTK)-associated  genes  (n  =  415  cases),  (3)  cases  with  high  phospho-AKT  (HIGH  P-AKT)  but  with  none  of  the 
above  somatic  alterations  (n  =  764  cases),  (4)  cases  with  LOW  phospho-AMPK  (LOW  P-AMPK)  but  with  none  of  the  above  somatic  alterations  (n  =  394  cases), 
(5) cases not aligned with any of the above (unaligned, n = 1,058 cases). AKT/MTOR/PIK3CA/PIK3R1/PTEN mutations represent “predicted functional” mutations
from  Figure  3D.  Other  mut.  track  involves  nonsilent  mutations  for  other  genes  represented  in  Figure  2A  (STAR  Methods  and  Figure  S6).  Protein  values  and 
proteomic  scores  normalized  to  SDs  from  the  median. 

(B) Boxplots of PI3K/AKT (top) and mTOR (bottom) pathway activity scores by alteration class. p Values by t test on log-transformed values. n.s., not significant
(p > 0.01). Boxplots represent 5%, 25%, 50%, 75%, and 95%.

(C) Enriched tumor types and mutations within the HIGH P-AKT group. p Values by one-sided Fisher’s exact test. IDH1 and VHL mutation events were significant
(p < 1E-10 and p < 0.01, respectively) when limiting the analysis to LGG and to KIRC/KIRP (renal) cases, respectively.

(D)  Top  differentially  expressed  proteins  in  HIGH  P-AKT  group  compared  to  unaligned  and  PI3K-altered  groups  (see  STAR  Methods),  not  including  core  PI3K/ 
AKT/mTOR  members. 

(E) Diagram of interactions involving PI3K/AKT/MTOR pathway represented by selected features from (C) and (D) (Carbonneau et al., 2016; Dodd et al., 2015; Guo
et al., 2016; Weiler et al., 2014), with differential protein expression patterns represented, comparing tumors in HIGH P-AKT group with tumors harboring PI3K/
RTK genomic alteration or with unaligned tumors. p Values by t test on log-transformed data. See also Figure S6.


2-hydroxyglutarate  by  mutated  IDH1/2  was  also  recently  found 
to  lead  to  the  activation  of  mTOR  (Carbonneau  et  al.,  2016); 
our  data  here  would  highlight  the  importance  of  both  of  the 
above  relationships  in  the  setting  of  human  cancer. 
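
Enrichment of a tumor type or mutation within the High P-AKT group is reported with a one-sided Fisher's exact test (Figure 7C). A minimal Python sketch of that test, written as a hypergeometric upper tail, is given below; the function name, argument names, and counts are hypothetical and serve only to illustrate the calculation.

```python
from math import comb

def fisher_enrichment_p(group_hits, group_size, total_hits, total_size):
    """One-sided Fisher's exact p value for enrichment.

    Probability of seeing at least `group_hits` feature-positive cases
    in a group of `group_size`, when `total_hits` of `total_size` cases
    overall carry the feature (hypergeometric upper tail).
    """
    upper = min(group_size, total_hits)
    tail = sum(
        comb(total_hits, k) * comb(total_size - total_hits, group_size - k)
        for k in range(group_hits, upper + 1)
    )
    return tail / comb(total_size, group_size)

# Made-up counts: 3 of 4 group members carry a feature that is
# present in 4 of 10 cases overall.
p_enrich = fisher_enrichment_p(3, 4, 4, 10)
print(p_enrich)
```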


DISCUSSION 

TCGA pan-cancer datasets have enabled us to examine human tumor correlations in the context of PI3K/AKT/mTOR to an extent
not previously possible. Our current model of the PI3K/AKT/mTOR pathway has developed over the course of numerous
independent molecular biology studies, spanning decades of research. In large part, our understanding of the pathway
members and interactions involved has been derived from experimental systems, including cell lines. While cell lines may
uncover cause-and-effect relationships in vitro, the relevance of such relationships in the setting of human diseases such
as cancer may not always be clear from these data alone. On the other hand, molecular data from human tumors provide
correlative (although not necessarily causal) relationships that would have relevance to disease in the human setting.
Most of the correlations observed in our study fit well with our understanding of PI3K/AKT/mTOR signaling, in particular
the genetic or genomic alteration of specific genes having an impact on phospho-protein expression of key downstream
intermediates. Genes or alteration classes that were previously underappreciated would also be found relevant in our
study, including partial loss of PTEN or STK11 (associated with both worse survival and increased mTOR signaling). Where
gene mutation often inactivates one allele, loss of one allele by copy alteration, which is common across multiple cancer
types for both PTEN and STK11, would presumably have the same impact on loss of gene function. IDH1 and VHL mutations
would also be implicated here with PI3K/AKT/mTOR, where such alterations were associated with particularly high AKT/mTOR
signaling, and which genes might be put forth for consideration as part of the “canon” of what would be recognized to
constitute the core standard model of the PI3K/AKT/mTOR pathway.

The multiplatform molecular datasets offered by TCGA allow for a more comprehensive view of the PI3K/AKT/mTOR pathway.
Pathway alterations in cancer may be manifested at different levels of molecular complexity, from DNA to protein to
transcriptional consequences. Integration with RPPA proteomic data allows us to assess the impact on pathway activation of
mutations or copy alterations observed at the DNA level. As observed in this study, multiple oncogenic pathways in addition
to PI3K/AKT/mTOR may regulate similar sets of transcriptional targets, where transcriptional patterns would represent a
degree of separation from the pathway as manifested at the protein level. Phospho-protein levels may only be assessed by
protein data and not mRNA data, which also represents an advantage of RPPA compared with other proteomic approaches
(Creighton and Huang, 2015). Clear overall trends may be observed when integrating proteomic data with data from other
platforms, although statistical trends (e.g., visualized as boxplots) would apply to groups of patients and not always to
the individual patient, which has implications regarding personalized therapy. Various sources of biological noise, in
addition to technical noise, may be present within human tumors, which give rise to variation in molecular signals.
Widespread molecular aberrations involving numerous genes and pathways within a given tumor, clonal heterogeneity,
microenvironmental influences, variable sample purity, and tissue-specific effects can all add noise to our ability to
match protein signals with specific DNA alterations. The RPPA methodology may have limitations as well (e.g., antibody
robustness, unknown history of sample material used to measure potentially labile phosphorylations, linearity of signal
readout, etc.), and instances where proteomic signals would seem disconnected from other molecular profile features of a
particular tumor may be difficult to interpret. The power of large sample numbers and the opportunities for data
integration offered by the TCGA pan-cancer cohort can aid greatly in detecting robust patterns relevant to our
understanding of aspects of pathway deregulation.

Results of this study include a comprehensive and annotated catalog of PI3K/AKT/mTOR-associated variants across over
10,000 tumors, which may serve as an additional resource for assessing variants in the clinical setting. One of the
challenges of applying personalized and precision medicine approaches to cancer therapy is the large number of gene
alterations that may be found within a given patient’s tumor. Stratifying patients by mutation status, e.g., PIK3CA
mutation, has been shown to increase response rates in clinical trials testing inhibitors to PI3K/AKT/mTOR pathway,
although non-responders are still common (Ilagan and Manning, 2016). Not all genetic variants impacting a given gene would
necessarily have a similar impact on its function, including a large fraction of observed PIK3CA variants. Oncogenic
variants that are found to occur frequently or are associated with a significant pattern would seem likely to be
functionally relevant. Other measures of predicting variant functionality include in silico structural predictions, in
vitro functional assays, domain-specific expertise, and protein expression, all of which were explored to varying extents
in the present study. In practice, multiple measures may be needed, as no single measure may capture all of the variants
likely to be functional. In addition, the RPPA proteomic platform would have potential for clinical applications to
personalized therapy (Creighton and Huang, 2015), and transcriptional signatures associated with inhibitor sensitivity in
cell lines may be defined (Singh et al., 2009). The focused, comprehensive analysis on the PI3K/AKT/mTOR pathway here will
serve as a valuable resource for understanding its deregulation in cancers and how to maximize its clinical utility.

STAR★METHODS

KEY RESOURCES TABLE

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
| --- | --- | --- |
| Deposited Data | | |
| TCGA whole exome DNA sequence data | Unified Ensemble “MC3” Call Set from DNA Nexus | https://www.synapse.org/#!Synapse:syn7214402 |
| TCGA protein expression data by RPPA | TCPA Portal, Level 4 | http://tcpaportal.org/tcpa/ |
| TCGA RNA expression by RNA-seq | Broad Firehose Datasets | gdac.broadinstitute.org |
| DNA copy number alteration by Affymetrix SNP 6 array | Broad Firehose Datasets | https://gdac.broadinstitute.org |
| TCGA whole genome DNA sequence | Genomic Data Commons | https://gdc.cancer.gov/ |
| Perturbational expression profiles (compounds, shRNAs) | BROAD LINCS database | http://www.lincsproject.org/ |
| mRNA and drug sensitivity measurements in cancer cell lines | Genomics of Drug Sensitivity in Cancer Portal | http://www.cancerrxgene.org/ |
| RPPA profiles of cancer cell lines | MCLP Data Portal | http://tcpaportal.org/mclp/ |
| Experimental Models: Cell Lines | | |
| MCF10A | ATCC | Authenticated by Short Tandem Repeat (STR) analysis at M.D. Anderson Characterized Cell Line Core facility (Houston, TX) |
| Ba/F3 | M.D. Anderson Characterized Cell Line Core facility (Houston, TX) | Parental cells validated based on continued dependence on IL3 for propagation (mouse-originated cell line) |

CONTACT  FOR  REAGENT  AND  RESOURCE  SHARING 

Further  information  and  requests  for  resources  and  reagents  should  be  directed  to  and  will  be  fulfilled  by  the  Lead  Contact,  Chad  J. 
Creighton  (creighto@bcm.edu). 

EXPERIMENTAL  MODEL  AND  SUBJECT  DETAILS 

Human  Subjects 

Cancer  molecular  profiling  data  were  generated  through  informed  consent  as  part  of  previously  published  studies  and  analyzed  in 
accordance  with  each  original  study’s  data  use  guidelines  and  restrictions. 

Cell  Lines 

Assay media for the survival assays were Advanced RPMI 1640 medium (Life Technologies) with 5% FBS (Life Technologies) and
1x GlutaMAX (Life Technologies) for Ba/F3 cells, and MEBM Basal medium (Lonza) with 100 ng/ml Cholera toxin (Lonza) and
52 ng/ml Bovine Pituitary Extract (BPE) (Lonza) for MCF10A cells.

METHOD  DETAILS 

TCGA  Patient  Cohort 

The  results  here  are  based  upon  data  generated  by  TCGA  Research  Network  (http://cancergenome.nih.gov/).  Molecular  data  from 
11219 human cancers were aggregated from public repositories (Table S1). Tumors spanned 32 different TCGA projects, each
project representing a specific cancer type, listed as follows: LAML, Acute Myeloid Leukemia; ACC, Adrenocortical carcinoma;
BLCA, Bladder Urothelial Carcinoma; LGG, Brain Lower Grade Glioma; BRCA, Breast invasive carcinoma; CESC, Cervical
squamous cell carcinoma and endocervical adenocarcinoma; CHOL, Cholangiocarcinoma; CRC, Colorectal adenocarcinoma
(combining COAD and READ projects); ESCA, Esophageal carcinoma; GBM, Glioblastoma multiforme; HNSC, Head and Neck
squamous cell carcinoma; KICH, Kidney Chromophobe; KIRC, Kidney renal clear cell carcinoma; KIRP, Kidney renal papillary
cell carcinoma; LIHC, Liver hepatocellular carcinoma; LUAD, Lung adenocarcinoma; LUSC, Lung squamous cell carcinoma; DLBC,
Lymphoid Neoplasm Diffuse Large B-cell Lymphoma; MESO, Mesothelioma; OV, Ovarian serous cystadenocarcinoma; PAAD,
Pancreatic adenocarcinoma; PCPG, Pheochromocytoma and Paraganglioma; PRAD, Prostate adenocarcinoma; SARC, Sarcoma;
SKCM, Skin Cutaneous Melanoma; STAD, Stomach adenocarcinoma; TGCT, Testicular Germ Cell Tumors; THYM, Thymoma;
THCA, Thyroid carcinoma; UCS, Uterine Carcinosarcoma; UCEC, Uterine Corpus Endometrial Carcinoma.

Cancer Cell 31, 820-832.e1-e3, June 12, 2017

Datasets 

Proteomic data were generated by RPPA across 7663 patient tumors obtained from TCGA. RPPA methodology and quality control
procedures have been described previously (Akbani et al., 2014; Li et al., 2017). In total, 225 high-quality antibodies targeting total
(n=166), cleaved (n=2), acetylated (n=1) and phosphoproteins (n=56) were used. The entire set of RPPA Pan-Cancer samples was run
in several different batches, resulting in potential batch effects on merging the sets; replicates-based normalization (RBN) (Akbani
et al., 2014) was therefore applied, using replicate samples run across multiple batches to adjust the data for batch effects. Data
(“Level 4”) are available from The Cancer Proteome Atlas (http://tcpaportal.org/tcpa/).

RNA-seq and miRNA-seq data were obtained from The Broad Institute Firehose pipeline (http://gdac.broadinstitute.org/). All
RNA-seq samples were aligned using the UNC RNA-seq V2 pipeline (The Cancer Genome Atlas Research Network, 2013). For
miRNA-seq data, only sample profiles from the HiSeq platform were used (representing n=8690 cases).

DNA from each tumor or germline-derived sample was hybridized to Affymetrix SNP 6.0 arrays as previously described (The
Cancer Genome Atlas Research Network, 2013) (n=10845 tumor profiles in all). GISTIC 2.0 was applied to the transformed copy
number data, with a noise threshold used to determine copy gain or loss. Low-level gene gain, high-level gene amplification, low-level
copy loss, or high-level copy loss were inferred using the “thresholded” calls as made by Broad Firehose pipeline (using +1, +2, -1,
or -2, respectively). High-level amplifications denote amplifications above the threshold and larger than the arm-level amplifications
observed for the given sample. Low-level copy deletions represent deletions above the threshold (approximating heterozygous
deletions in the absence of whole genome doubling); high-level copy deletions denote copy losses above the threshold and greater
than the minimum arm-level deletion observed for the sample (approximating homozygous deletions in the absence of whole genome
doubling). Log (tumor/normal) copy values were used to evaluate correlations with survival in Figure 6A.
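
The correspondence between thresholded calls and alteration classes described above can be written out explicitly. The following is a minimal sketch rather than the pipeline's own code: the dictionary and function names are hypothetical, and the label used for a call of 0 (not discussed in the text) is an assumption.

```python
# Thresholded copy-number calls mapped to the alteration classes named in the text.
# All names here are illustrative; the label for 0 is an assumption.
COPY_CALL_LABELS = {
    2: "high-level gene amplification",
    1: "low-level gene gain",
    0: "no gain or loss called",
    -1: "low-level copy loss",
    -2: "high-level copy loss",
}


def label_copy_call(call: int) -> str:
    """Return the alteration class for one thresholded call (+2 to -2)."""
    if call not in COPY_CALL_LABELS:
        raise ValueError(f"unexpected thresholded call: {call!r}")
    return COPY_CALL_LABELS[call]


def is_copy_loss(call: int) -> bool:
    """True for low-level (-1) or high-level (-2) copy loss."""
    return call < 0


if __name__ == "__main__":
    for call in (2, 1, 0, -1, -2):
        print(call, label_copy_call(call))
```

Under this reading, the "partial loss" of PTEN or STK11 discussed in the text would correspond to a call of -1.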

Somatic mutation calls were obtained from the publicly available “MC3” TCGA MAF file (covering n=10224 patients,
https://www.synapse.org/#!Synapse:syn7214402). This MC3 set is a re-calling of uniform files from all TCGA projects, with variant calling using a
standardized set of mutation callers. The BAM files used underwent a standardized local re-alignment to hg19 (Genome Reference
Consortium GRCh37), six calling algorithms were applied, and a number of automated filters were applied. Variants called by two or
more algorithms were used in the study. Whole genome sequence analysis was carried out for 1363 cases (with paired normal
samples, high pass coverage for BRCA and OV cases, low pass for BLCA, CESC, CRC, ESCA, HNSC, LGG, LUAD, PRAD, SKCM, STAD,
THCA, UCEC, and UVM). Genomic rearrangements were detected in all tumor and normal genomes by Meerkat (Yang et al., 2013).
Support from five discordant read pairs was required for each event. Each event detected in a tumor genome was filtered against all
normal genomes to ensure it represented a somatic event.
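
The "two or more algorithms" rule can be illustrated as follows. This is only a sketch under stated assumptions: the input is taken to be a flat list of (variant, caller) pairs, and the variant keys and caller names are hypothetical placeholders, not the layout of the MC3 file.

```python
from collections import defaultdict


def variants_with_caller_support(calls, min_callers=2):
    """Keep variants reported by at least `min_callers` distinct callers.

    `calls` is an iterable of (variant_key, caller_name) pairs; this input
    layout is an assumption made for illustration.
    """
    callers_by_variant = defaultdict(set)
    for variant_key, caller_name in calls:
        callers_by_variant[variant_key].add(caller_name)
    return {
        variant
        for variant, callers in callers_by_variant.items()
        if len(callers) >= min_callers
    }


if __name__ == "__main__":
    example_calls = [
        (("chr3", 1000, "A", "G"), "caller_A"),
        (("chr3", 1000, "A", "G"), "caller_B"),
        (("chr10", 2000, "C", "T"), "caller_A"),
        (("chr10", 2000, "C", "T"), "caller_A"),  # same caller reported twice
    ]
    print(variants_with_caller_support(example_calls))
```

Using a set per variant means a caller that reports the same variant twice is counted once.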

Gene  and  Protein  Signatures 

Pan-cancer  RPPA  profiles  were  scored  for  a  PI3K/AKT  pathway  signature,  defined  as  the  sum  of  normalized  phosphoprotein  levels  of 
AKT  (both  S473  and  T308  RPPA  features),  GSK3  (S9  and  S21/S9  features),  PRAS40,  and  phospho-TSC2.  RPPA  profiles  were  also 
scored  for  an  mTOR  pathway  signature,  defined  as  the  sum  of  phosphoprotein  levels  of  mTOR,  4EBP1  (S65,  T37/T46,  and  T70  RPPA 
features),  P70S6K,  and  S6  (S235/S236  and  S240/S244  features). 
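
As an illustration of how such a protein signature score may be computed, the sketch below normalizes each RPPA feature to standard deviations from the median across tumors and then sums the normalized values per tumor. The feature names and the input layout are hypothetical, and applying the normalization per feature before summing is our reading of the description rather than a confirmed detail.

```python
import statistics


def normalize_to_sd_from_median(values):
    """Express each value as the number of SDs from the median of `values`."""
    values = list(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)
    return [(value - median) / sd for value in values]


def pathway_score(levels_by_feature):
    """Sum normalized phosphoprotein levels across features, per tumor.

    `levels_by_feature` maps a feature name to a list of levels, one per
    tumor, with tumors in the same order for every feature (an assumed layout).
    """
    normalized = [
        normalize_to_sd_from_median(levels)
        for levels in levels_by_feature.values()
    ]
    # zip(*...) regroups the per-feature lists into per-tumor tuples.
    return [sum(per_tumor) for per_tumor in zip(*normalized)]


if __name__ == "__main__":
    toy_levels = {
        "AKT_pS473": [1.0, 2.0, 3.0],
        "AKT_pT308": [2.0, 4.0, 6.0],
    }
    print(pathway_score(toy_levels))
```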

Gene transcriptional signatures of PI3K/AKT/mTOR pathway were defined as described previously (Creighton et al., 2010): “Saal”
PTEN loss signature, genes correlated with Pten protein levels in breast cancer; “CMap” PI3K/AKT/mTOR signature, genes
modulated in vitro by inhibitors to PI3K or mTOR, according to CMap dataset (p<0.01, comparing PI3K/mTOR-inhibited cells with the rest
of the CMap profiles); “Majumder” Akt signature, genes modulated in a mouse model of inducible AKT (p<0.01). MYC signatures
(Coller and Bild) and the Bild Ras signature were from Creighton (2008), and the Settleman k-ras sensitivity signature was
from Singh et al. (2009). For a given gene transcription signature, we extracted the expression values from the TCGA gene
expression array dataset. For each gene, we normalized expression values to standard deviations from the median across tumors. For
signatures with “up” versus “down” genes, we computed our previously described “t-score” (Creighton et al., 2010) to score
each tumor profile for relative manifestation of the signature.
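
One way to picture the scoring step is sketched below. It assumes that the t-score for a tumor is a two-sample t statistic comparing that tumor's normalized values for the "up" genes with those for the "down" genes; this is our reading of the cited description and should be checked against Creighton et al. (2010). The unequal-variance (Welch) form, the function name, and the gene names are choices made for illustration.

```python
import math
import statistics


def signature_t_score(profile, up_genes, down_genes):
    """Welch t statistic comparing "up" versus "down" genes in one tumor.

    `profile` maps gene name to a normalized expression value for a single
    tumor. Genes absent from the profile are skipped. Each gene set needs at
    least two measured genes with non-identical values.
    """
    up = [profile[gene] for gene in up_genes if gene in profile]
    down = [profile[gene] for gene in down_genes if gene in profile]
    mean_difference = statistics.mean(up) - statistics.mean(down)
    standard_error = math.sqrt(
        statistics.variance(up) / len(up)
        + statistics.variance(down) / len(down)
    )
    return mean_difference / standard_error


if __name__ == "__main__":
    toy_profile = {"UP1": 1.0, "UP2": 2.0, "UP3": 3.0,
                   "DN1": -1.0, "DN2": -2.0, "DN3": -3.0}
    print(signature_t_score(toy_profile,
                            ["UP1", "UP2", "UP3"],
                            ["DN1", "DN2", "DN3"]))
```

A positive score indicates that the tumor shows the signature pattern (up genes high, down genes low); a negative score indicates the inverse pattern.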

For deriving a PI3K/AKT/mTOR drug sensitivity signature in cell lines (Figure 5D), we utilized the dataset from Garnett et al.
(2012). For the 11 inhibitors to PI3K/AKT/mTOR represented in Garnett (including Rapamycin:MTOR, JW-7-52-1:MTOR,
A-443654:AKT1/2/3, CHIR-99021:GSK3B, AZD6482:PI3Kb (P3C2B), AKT inhibitor VIII:AKT1/2, Temsirolimus:MTOR,
MK-2206:AKT1/2, NVP-BEZ235:PI3K (class 1) and mTORC1/2, GDC0941:PI3K (class 1), and AZD8055:mTORC1/2), we normalized IC50
values to standard deviations from the median, then averaged them to obtain a single drug sensitivity score. Each gene was correlated
in expression with the drug sensitivity score, first selecting for genes significant with p<0.01 by t-test on log-transformed data (1099
significant genes), then further selecting for genes remaining significant after correcting for tissue type differences using a regression
model that incorporates tumor type as a confounder (146 genes with corrected p<0.01).
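
The construction of the combined drug sensitivity score can be sketched as follows. The nested-dictionary input, the cell line names, and the IC50 values are hypothetical, and the sketch assumes that normalization is done per drug across cell lines and that a cell line's score averages over the drugs for which it has a measurement.

```python
import statistics


def drug_sensitivity_scores(ic50_by_drug):
    """Average per-drug normalized IC50 values into one score per cell line.

    `ic50_by_drug` maps drug name -> {cell line -> IC50 value}. For each drug,
    values are expressed as SDs from the median across cell lines; a cell
    line's score is the mean of its normalized values over available drugs.
    """
    normalized_by_line = {}
    for drug, ic50_by_line in ic50_by_drug.items():
        values = list(ic50_by_line.values())
        median = statistics.median(values)
        sd = statistics.stdev(values)
        for cell_line, ic50 in ic50_by_line.items():
            normalized_by_line.setdefault(cell_line, []).append((ic50 - median) / sd)
    return {
        cell_line: statistics.mean(normalized)
        for cell_line, normalized in normalized_by_line.items()
    }


if __name__ == "__main__":
    toy_ic50 = {
        "Rapamycin": {"line_A": 1.0, "line_B": 2.0, "line_C": 3.0},
        "MK-2206": {"line_A": 2.0, "line_B": 4.0, "line_C": 6.0},
    }
    print(drug_sensitivity_scores(toy_ic50))
```

Because a lower IC50 means greater sensitivity, lower scores here correspond to cell lines that are more sensitive across the inhibitor panel.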

In  Silico  Mutation  Evaluation 

In assessing whether mutations may be more or less likely to have a functional effect on the resulting protein, a number of factors
were considered. Somatic substitution hotspots (470 in total involving 275 genes), based on a previous pan-cancer analysis of
11119 human tumors (Chang et al., 2016), were incorporated into the present study where noted. Mutation Assessor calls predicting
the functional impact (medium to high) of somatic mutation (Reva et al., 2011) were obtained from cBioPortal (Cerami et al., 2012).
Manual review of variants involving AKT1/2/3, MTOR, PIK3CA, PTEN, RHEB, TSC1/2 was also carried out by domain experts in the
analysis group. Mutations that were predicted as potentially functional by any of the above, as well as mutations in tumor suppressor
genes (e.g., PTEN, PIK3R1) classified as nonsense, frameshift, or indel, were evaluated separately with respect to comparing with
AKT pS473 phospho-protein expression (Figure 3D).

Cell  Line  Viability  Assays 

The effects of mutations on the function of PIK3CA and PIK3R1 were assessed in Ba/F3 and MCF10A by survival assay as previously
described (Dogruluk et al., 2015) with lentiviral vector pHAGE used in the cloning. In Ba/F3, the PIK3CA mutations were assigned as
“Strong activating (SA)” if the mutations have an activity higher than M1043I (known moderate driver); as “Moderate activating (MA)”
if the mutations have a similar or lower activity than M1043I; as “No difference from WT (NDFW)” if the mutations have an activity
similar to WT; or as “Inactivating (INA)” if the mutations have an activity similar to negative controls (GFP/mCherry/Luciferase).
The PIK3R1 mutations were assigned as “SA” if the mutations have a relative level of activation higher than that of PIK3CA
M1043I compared with negative controls; as “MA” if the mutations have a relative level of activation between PIK3CA M1043I and
WT; as “Weak activating (WA)” if the mutations have a relative level of activation between PIK3CA WT and negative controls; or
as “NDFW” if the mutations have an activity similar to WT. In MCF10A, the PIK3CA mutations were assigned as “SA” and
“NDFW” by the same means as in the Ba/F3 model. The mutations were assigned as “MA” and “WA” if the mutations have an activity
above and below 50% of that of M1043I, respectively.

Tumor  Classes  by  Gene  Alteration 

Genetic/genomic alteration classes in relation to PI3K/AKT/mTOR pathway alteration were defined (Figure 7A), in order to relate
these to PI3K/AKT and mTOR activation, as defined by protein signature score. For AKT/MTOR/PIK3CA/PIK3R1/PTEN mutations,
“predicted functional” mutations from Figure 3D were used. An “other gene mutation” class of Figures 7A and 7B involved nonsilent
mutations for other genes represented in Figure 2A (AKTS1, DEPDC5, DEPTOR, MAPKAP1, MLST8, NPRL2, NPRL3, PDK1, PRR5,
RHEB, RICTOR, RPTOR, PIK3C2B). The RTK group represented cases with hotspot mutations in KRAS, BRAF, EGFR, or ERBB2
that were not also included in the other PI3K/AKT/mTOR-related groups. The set of genes previously found significantly mutated
in pan-cancer analysis (Lawrence et al., 2014) was searched for enrichment of mutation events within the High P-AKT group
(Figure 7C). When defining proteins that were highly expressed specifically within the High P-AKT group (Figure 7D), RPPA features were
selected that were over- or under-expressed in the High P-AKT compared to unaligned cases (p<0.05, t-test on log-transformed
data) for at least four of the seven cancer types, and differentially expressed in High P-AKT compared to unaligned and to
PI3K/AKT/mTOR or RTK-altered cases across all cancer cases (p<0.01 for each).
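
The enrichment search for a given gene can be framed as a one-sided test on a 2x2 table (mutated versus not mutated, inside versus outside the High P-AKT group), consistent with the one-sided Fisher's exact test named in the Figure 7 legend. The following is a minimal standard-library sketch; the function name, table layout, and counts are made up for illustration.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided (enrichment) Fisher's exact p value for a 2x2 table.

    Table layout (an assumed convention):
        a = mutated, in group        b = not mutated, in group
        c = mutated, outside group   d = not mutated, outside group
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    group_size = a + b
    mutated_total = a + c
    total = a + b + c + d
    upper = min(group_size, mutated_total)
    favorable = sum(
        comb(mutated_total, k) * comb(total - mutated_total, group_size - k)
        for k in range(a, upper + 1)
    )
    return favorable / comb(total, group_size)


if __name__ == "__main__":
    # Toy counts only: 3 of 3 group tumors mutated, 0 of 3 others mutated.
    print(fisher_exact_one_sided(3, 0, 0, 3))
```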

QUANTIFICATION  AND  STATISTICAL  ANALYSIS 

All  p  values  were  two-sided  unless  otherwise  specified.  Statistical  significance  was  defined  at  the  0.05  threshold.  All  available  TCGA 
data  in  the  public  domain  at  the  time  of  this  study  was  utilized,  and  no  patients  were  deliberately  excluded.  Differential  expression 
between  comparison  groups  was  assessed  using  t-test  on  log-transformed  values.  For  visualization  using  heat  maps  and  box  plots, 
mRNA  and  protein  expression  values  were  z-normalized  to  standard  deviations  from  the  median  across  all  tumor  sample  profiles. 

Individual gene and protein features were evaluated for correlation with patient survival by univariate Cox analysis; in addition, a
stratified Cox model was used to evaluate survival association when correcting for tumor type. For PTEN and STK11 copy alteration
features (Log [tumor/normal] ratios), Cox regression analysis within each individual cancer type was carried out; then, in order to
aggregate the results across cancer types, we used the “metafor” R package to conduct meta-analyses, with a random-effects model
used to estimate the overall effect of the molecular feature. For Kaplan-Meier plots, a stratified log-rank test evaluated
differences between tumor groups after correction for tumor type. Patient survival data from TCGA were current as of March 31, 2016.
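
To make the aggregation step concrete, the sketch below pools per-cancer-type Cox estimates (log hazard ratios with their variances) under a random-effects model. It is written in Python for illustration and is not the metafor code used in the study; it uses the DerSimonian-Laird estimator of between-study variance for brevity, whereas metafor's default estimator is REML, so results may differ slightly. The function name and input values are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """Pool effect estimates with a DerSimonian-Laird random-effects model.

    `estimates` are per-cancer-type log hazard ratios and `variances` their
    squared standard errors. Returns (pooled estimate, standard error, tau^2).
    """
    k = len(estimates)
    weights = [1.0 / v for v in variances]
    weight_sum = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / weight_sum
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    c = weight_sum - sum(w * w for w in weights) / weight_sum
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_sum = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / re_sum
    return pooled, math.sqrt(1.0 / re_sum), tau2


if __name__ == "__main__":
    toy_log_hazard_ratios = [0.30, 0.10, 0.45]
    toy_variances = [0.02, 0.05, 0.03]
    print(random_effects_pool(toy_log_hazard_ratios, toy_variances))
```

When the per-type estimates agree closely, the between-type variance is estimated as zero and the result reduces to the usual inverse-variance fixed-effect average.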